├── Copyright ├── README.md └── c.rkt /Copyright: -------------------------------------------------------------------------------- 1 | Copyright (c) 2013 Andrew W. Keep 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a 4 | copy of this software and associated documentation files (the "Software"), 5 | to deal in the Software without restriction, including without limitation 6 | the rights to use, copy, modify, merge, publish, distribute, sublicense, 7 | and/or sell copies of the Software, and to permit persons to whom the 8 | Software is furnished to do so, subject to the following conditions: 9 | 10 | The above copyright notice and this permission notice shall be included in 11 | all copies or substantial portions of the Software. 12 | 13 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 14 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 15 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL 16 | THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 17 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 18 | FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 19 | DEALINGS IN THE SOFTWARE. 20 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Scheme-to-C 2 | ============ 3 | 4 | This is a tiny nanopass compiler for compiling a small subset of Scheme to C. 5 | It was developed to be presented at Clojure Conj 2013, in Alexandria, VA. 6 | It has been a little over a week and half since I started working on it and 7 | I've added more documentation and more tests since Clojure Conj. 8 | 9 | There is still much to be done. More tests are needed and a few passes are 10 | still not documented, and I still have not tested the boehm collector, though 11 | I hope to do so soon. 12 | 13 | Required Libraries 14 | =================== 15 | There are two required git repositories to run this compiler. This repository 16 | (of course), and the nanopass framework repository at 17 | github.com/akeep/nanopass-framework. 18 | 19 | You will also need one of the supported host compilers listed in the next 20 | section. 21 | 22 | Supported Host Compilers 23 | ========================= 24 | 25 | The current version supports three host compilers: Chez Scheme, Ikarus, and 26 | Vicare. All three share similar features that are needed by the Nanopass 27 | Framework in order to operate. 28 | 29 | Getting Chez Scheme 30 | -------------------- 31 | A free version of the Chez Scheme interpreter, Petite Chez Scheme, is 32 | available for download on http://www.scheme.com/. Generally, it is easiest 33 | to install the non-threaded version that is available for your machine, and 34 | use the 64-bit version, if it is supported on your platform. 35 | 36 | Running on Chez Scheme 37 | ----------------------- 38 | In the `scheme-to-c` directory start `petite` using the `--libdirs` command 39 | line switch to tell `petite` where to find the `nanopass-framework` directory: 40 | 41 | ``` 42 | $ petite --libdirs .: 43 | Petite Chez Scheme Version 8.4 44 | Copyright (c) 1985-2011 Cadence Research Systems 45 | 46 | > (import (c)) 47 | > (my-tiny-compile '(+ 4 5)) 48 | 9 49 | ``` 50 | 51 | You can run the tests as: 52 | 53 | ``` 54 | $ petite --libdirs .: 55 | Petite Chez Scheme Version 8.4 56 | Copyright (c) 1985-2011 Cadence Research Systems 57 | 58 | > (import (tests)) 59 | > (run-tests) 60 | running test 0: 61 | 0 62 | passed 63 | running test 1: 64 | -5 65 | passed 66 | . 67 | . 68 | . 69 | ``` 70 | 71 | Getting Ikarus 72 | --------------- 73 | Ikarus is no longer under active development, but it is easier to install on a 74 | Mac OS X machine than Vicare, so I recommend it if you are on a Mac. You can 75 | find Ikarus at https://launchpad.net/ikarus and you can download it using bzr. 76 | The easiest way to install it is to use the `c64` install script. This installs 77 | it into the `$HOME/.opt64` subdirectory and you can add `$HOME/.opt64/bin` to 78 | the path to run `ikarus`. 79 | 80 | Running on Ikarus 81 | ------------------ 82 | When running on `ikarus` you will also need to add the `nanopass-framework` 83 | directory to the path in order to run the new compiler. This can be done with 84 | an environment variable, but I generally find it easier to do this on the 85 | `ikarus` REPL as follows: 86 | 87 | ``` 88 | $ ikarus 89 | Ikarus Scheme version 0.0.4-rc1+, 64-bit (revision 1870, build 2013-10-16) 90 | Copyright (c) 2006-2009 Abdulaziz Ghuloum 91 | 92 | > (library-path (cons "../nanopass-framework" (library-path))) 93 | > (import (c)) 94 | > (my-tiny-compiler '(+ 4 5)) 95 | 9 96 | ``` 97 | 98 | You can also run the tests through `ikarus` as follows: 99 | 100 | ``` 101 | $ ikarus 102 | Ikarus Scheme version 0.0.4-rc1+, 64-bit (revision 1870, build 2013-10-16) 103 | Copyright (c) 2006-2009 Abdulaziz Ghuloum 104 | 105 | > (library-path (cons "../nanopass-framework" (library-path))) 106 | > (import (tests)) 107 | > (run-tests) 108 | running test 0: 109 | 0 110 | passed 111 | running test 1: 112 | -5 113 | passed 114 | . 115 | . 116 | . 117 | ``` 118 | 119 | Getting Vicare 120 | --------------- 121 | Vicare Scheme is a fork of Ikarus that is currently under development. 122 | You can find Vicare at http://marcomaggi.github.io/vicare.html. This can 123 | be installed using the standard GNU style configure script. The only 124 | features we require is posix support, since we need the `system` call to 125 | shell out to run `gcc`. 126 | 127 | Running on Vicare 128 | ------------------ 129 | Similar to Ikarus and Chez Scheme, Vicare also needs to be informed of where 130 | to find the `nanopass-framework` directory. Vicare also needs to be told to 131 | look for additional Scheme file extensions, since I am using `.ss` instead of 132 | the more recently introduced `.sls` for R6RS Scheme libraries. Here we can 133 | use the `--more-file-extensions` and two calls to the `--search-path` command 134 | line flag, one to search in the `nanopass-framework` directory and one in the 135 | current directory, `.`, which is otherwise not included. 136 | 137 | ``` 138 | $ vicare --more-file-extensions --search-path ../nanopass-framework --search-path . 139 | Vicare Scheme version 0.3d1, 64-bit 140 | Revision no-branch/no-commit 141 | Build 2013-10-15 142 | 143 | Copyright (c) 2006-2010 Abdulaziz Ghuloum and contributors 144 | Copyright (c) 2011-2013 Marco Maggi 145 | 146 | vicare> (import (c)) 147 | vicare> (my-tiny-compile '(+ 4 5)) 148 | 9 149 | ``` 150 | 151 | We can also run the tests under Vicare as: 152 | 153 | ``` 154 | $ vicare --more-file-extensions --search-path ../nanopass-framework --search-path . 155 | Vicare Scheme version 0.3d1, 64-bit 156 | Revision no-branch/no-commit 157 | Build 2013-10-15 158 | 159 | Copyright (c) 2006-2010 Abdulaziz Ghuloum and contributors 160 | Copyright (c) 2011-2013 Marco Maggi 161 | 162 | vicare> (import (tests)) 163 | vicare> (run-tests) 164 | running test 0: 165 | 0 166 | passed 167 | running test 1: 168 | -5 169 | passed 170 | . 171 | . 172 | . 173 | ``` 174 | 175 | More To Do 176 | =========== 177 | 178 | - [x] start better documentation of code 179 | - [x] add tests for the compiler 180 | - [x] test on Chez, Ikarus, and Vicare 181 | - [ ] test the Boehm garbage collector 182 | - [ ] add more and larger tests 183 | - [ ] improve the testing framework to allow for quieter output and errors 184 | - [ ] add predicates for fixnum and procedure 185 | - [ ] finish documentation of code 186 | -------------------------------------------------------------------------------- /c.rkt: -------------------------------------------------------------------------------- 1 | #lang nanopass 2 | ;;; Copyright (c) 2013 Andrew W. Keep 3 | ;;; See the accompanying file Copyright for details 4 | ;;; 5 | ;;; A nanopass compiler developed to use as a demo during Clojure Conj 2013. 6 | ;;; The source language for the compiler is: 7 | ;;; 8 | ;;; Expr --> 9 | ;;; | 10 | ;;; | 11 | ;;; | (quote ) 12 | ;;; | (if ) 13 | ;;; | (if ) 14 | ;;; | (or ...) 15 | ;;; | (and ...) 16 | ;;; | (not ) 17 | ;;; | (begin ... ) 18 | ;;; | (lambda ( ...) ... ) 19 | ;;; | (let ([ ] ...) ... ) 20 | ;;; | (letrec ([ ] ...) ... ) 21 | ;;; | (set! ) 22 | ;;; | ( ...) 23 | ;;; 24 | ;;; Primitive --> car | cdr | cons | pair? | null? | boolean? | make-vector 25 | ;;; | vector-ref | vector-set! | vector? | vector-length | box 26 | ;;; | unbox | set-box! | box? | + | - | * | / | = | < | <= | > 27 | ;;; | >= | eq? 28 | ;;; Var --> symbol 29 | ;;; Const --> #t | #f | '() | integer between -2^60 and 2^60 - 1 30 | ;;; Datum --> | ( . ) | #( ...) 31 | ;;; 32 | ;;; or in nanopass parlance: 33 | ;;; (define-language Lsrc 34 | ;;; (terminals 35 | ;;; (symbol (x)) 36 | ;;; (primitive (pr)) 37 | ;;; (constant (c)) 38 | ;;; (datum (d))) 39 | ;;; (Expr (e body) 40 | ;;; pr 41 | ;;; x 42 | ;;; c 43 | ;;; (quote d) 44 | ;;; (if e0 e1) 45 | ;;; (if e0 e1 e2) 46 | ;;; (or e* ...) 47 | ;;; (and e* ...) 48 | ;;; (not e) 49 | ;;; (begin e* ... e) 50 | ;;; (lambda (x* ...) body* ... body) 51 | ;;; (let ([x* e*] ...) body* ... body) 52 | ;;; (letrec ([x* e*] ...) body* ... body) 53 | ;;; (set! x e) 54 | ;;; (e e* ...))) 55 | ;;; 56 | ;;; The following exports are defined for this library: 57 | ;;; 58 | ;;; (my-tiny-compile ) 59 | ;;; my-tiny-compile is the main interface the compiler, where is 60 | ;;; a quoted expression for the compiler to evaluate. This procedure will 61 | ;;; run the nanopass parts of the compiler, produce a C output file in t.c, 62 | ;;; compile it using gcc to a program t, run the program t, directing its 63 | ;;; output to t.out, and finally use the Scheme reader to read t.out and 64 | ;;; return the value to the host Scheme system. For example, if we wanted 65 | ;;; to run a program that calculates the factorial of 5, we could do the 66 | ;;; following: 67 | ;;; (my-tiny-compile '(letrec ([f (lambda (n) 68 | ;;; (if (= n 0) 69 | ;;; 1 70 | ;;; (* n (f (- n 1)))))]) 71 | ;;; (f 10))) 72 | ;;; 73 | ;;; (trace-passes) 74 | ;;; (trace-passes ) 75 | ;;; trace-passes is a parameter used by my-tiny-compile to decide what 76 | ;;; passees should have their output printed. When it is called without 77 | ;;; any arguments, it returns the list of passes to be traced. When it 78 | ;;; is called with an argument, the argument should be one of the 79 | ;;; following: 80 | ;;; ' - sets this pass to be traced 81 | ;;; '( ...) - set the list of passes to trace 82 | ;;; #t - traces all passes 83 | ;;; #f - turns off trace passing 84 | ;;; 85 | ;;; all-passes 86 | ;;; lists all passes in the compiler. 87 | ;;; 88 | ;;; (use-boehm?) 89 | ;;; (use-boehm? ) 90 | ;;; use-boehm? is a parameter that indicates if the generated C code should 91 | ;;; attempt to use the boehm garbage collector. This feature is, as of 92 | ;;; yet, untested. 93 | ;;; 94 | ;;; Internals that are exported to make them available for programmers 95 | ;;; experimenting with the compiler. 96 | ;;; 97 | ;;; TBD 98 | ;;; 99 | ;;; 100 | (provide Lsrc unparse-Lsrc 101 | L1 unparse-L1 102 | L2 unparse-L2 103 | L3 unparse-L3 104 | L4 unparse-L4 105 | L5 unparse-L5 106 | L6 unparse-L6 107 | L7 unparse-L7 108 | L8 unparse-L8 109 | L9 unparse-L9 110 | L10 unparse-L10 111 | L11 unparse-L11 112 | L12 unparse-L12 113 | L13 unparse-L13 114 | L14 unparse-L14 115 | L15 unparse-L15 116 | L16 unparse-L16 117 | L17 unparse-L17 118 | L18 unparse-L18 119 | L19 unparse-L19 120 | ; L20 unparse-L20 121 | L21 unparse-L21 122 | L22 unparse-L22 123 | 124 | unique-var 125 | 126 | user-alloc-value-prims 127 | user-non-alloc-value-prims 128 | user-pred-prims 129 | user-effect-prims 130 | user-prims 131 | void+user-non-alloc-value-prims 132 | void+user-prims 133 | closure+user-alloc-value-prims 134 | closure+void+user-non-alloc-value-prims 135 | closure+user-effect-prims 136 | internal+closure+user-effect-prims 137 | closure+void+user-prims 138 | 139 | primitive? 140 | void+primitive? 141 | closure+void+primitive? 142 | effect-free-prim? 143 | predicate-primitive? 144 | effect-primitive? 145 | value-primitive? 146 | non-alloc-value-primitive? 147 | effect+internal-primitive? 148 | 149 | target-fixnum? 150 | constant? 151 | datum? 152 | integer-64? 153 | 154 | set-cons 155 | union 156 | difference 157 | intersect 158 | 159 | parse-and-rename 160 | remove-one-armed-if 161 | remove-and-or-not 162 | make-begin-explicit 163 | inverse-eta-raw-primitives 164 | quote-constants 165 | remove-complex-constants 166 | identify-assigned-variables 167 | purify-letrec 168 | optimize-direct-call 169 | find-let-bound-lambdas 170 | remove-anonymous-lambda 171 | convert-assignments 172 | uncover-free 173 | convert-closures 174 | optimize-known-call 175 | expose-closure-prims 176 | lift-lambdas 177 | remove-complex-opera* 178 | recognize-context 179 | expose-allocation-primitives 180 | return-of-set! 181 | flatten-set! 182 | ; push-if 183 | specify-constant-representation 184 | expand-primitives 185 | generate-c 186 | 187 | use-boehm? 188 | 189 | my-tiny-compile 190 | trace-passes 191 | all-passes) 192 | 193 | (require racket/fixnum 194 | racket/pretty 195 | racket/system 196 | (for-syntax syntax/parse 197 | syntax/stx 198 | nanopass/base 199 | racket/syntax)) 200 | 201 | ;;; As of yet untested feature for using the boehm GC 202 | ;;; in the compiled output of our compiler. 203 | (define use-boehm? 204 | (let ([use? #f]) 205 | (case-lambda 206 | [() use?] 207 | [(u?) (set! use? u?)]))) 208 | 209 | ;;; Representation of our data types. 210 | ;;; We use tagged pointers, because all of our pointers are 8-byte aligned, 211 | ;;; leaving te bottom 3 bits always being 0. Using these 3 bits for tags 212 | ;;; lets us store things like fixnums as pointers, and differentiate them 213 | ;;; from pointers like closures and vectors. It also saves us using a word 214 | ;;; for a tag when in our representation of vectors, closures, etc. 215 | (define fixnum-tag #b000) 216 | (define fixnum-mask #b111) 217 | 218 | (define pair-tag #b001) 219 | (define pair-mask #b111) 220 | 221 | (define box-tag #b010) 222 | (define box-mask #b111) 223 | 224 | (define vector-tag #b011) 225 | (define vector-mask #b111) 226 | 227 | (define closure-tag #b100) 228 | (define closure-mask #b111) 229 | 230 | ;;; NOTE: #b101 is used for constants 231 | 232 | (define boolean-tag #b1101) 233 | (define boolean-mask #b1111) 234 | 235 | (define true-rep #b111101) 236 | (define false-rep #b101101) 237 | 238 | (define null-rep #b100101) 239 | (define void-rep #b110101) 240 | 241 | (define fixnum-shift 3) 242 | (define word-size 8) 243 | 244 | 245 | 246 | ;;; Helper function for representing unique variables as symbols by adding a 247 | ;;; number to the variables (so if we start with f we get f.n where n might 248 | ;;; be 1, 2, 3, etc, and is unique). 249 | (define unique-var 250 | (let () 251 | (define count 0) 252 | (lambda (name) 253 | (let ([c count]) 254 | (set! count (+ count 1)) 255 | (string->symbol 256 | (string-append (symbol->string name) "." (number->string c))))))) 257 | 258 | ;; strip the numberic bit back off the unique-var 259 | (define base-var 260 | (lambda (x) 261 | (define s0 262 | (lambda (rls) 263 | (if (null? rls) 264 | (error 'base-var "not a unique-var created variable ~a" x) 265 | (let ([c (car rls)]) 266 | (cond 267 | [(char-numeric? c) (s1 (cdr rls))] 268 | [else (error 'base-var 269 | "not a unique-var created variable ~a" x)]))))) 270 | (define s1 271 | (lambda (rls) 272 | (if (null? rls) 273 | (error 'base-var "not a unique-var created variable ~a" x) 274 | (let ([c (car rls)]) 275 | (cond 276 | [(char-numeric? c) (s1 (cdr rls))] 277 | [(char=? c #\.) (cdr rls)] 278 | [else (error 'base-var 279 | "not a unique-var created variable ~a" x)]))))) 280 | (string->symbol 281 | (list->string 282 | (reverse 283 | (s0 (reverse (string->list (symbol->string x))))))))) 284 | 285 | 286 | ;;; Convenience procedure for building temporaries in the compiler. 287 | (define make-tmp (lambda () (unique-var 't))) 288 | 289 | ;;; Helpers for the various sets of primitives we have over the course of the 290 | ;;; compiler 291 | ;;; All primitives: 292 | ;;; 293 | ;;; | | | Langauge | Language | 294 | ;;; primitive | arity | context | introduced | removed | 295 | ;;; --------------------+-------+---------+------------+----------+ 296 | ;;; cons | 2 | value | Lsrc | L17 | 297 | ;;; make-vector | 1 | value | Lsrc | L17 | 298 | ;;; box | 1 | value | Lsrc | L17 | 299 | ;;; car | 1 | value | Lsrc | L22 | 300 | ;;; cdr | 1 | value | Lsrc | L22 | 301 | ;;; vector-ref | 2 | value | Lsrc | L22 | 302 | ;;; vector-length | 1 | value | Lsrc | L22 | 303 | ;;; unbox | 1 | value | Lsrc | L22 | 304 | ;;; + | 2 | value | Lsrc | L22 | 305 | ;;; - | 2 | value | Lsrc | L22 | 306 | ;;; * | 2 | value | Lsrc | L22 | 307 | ;;; / | 2 | value | Lsrc | L22 | 308 | ;;; pair? | 1 | pred | Lsrc | L22 | 309 | ;;; null? | 1 | pred | Lsrc | L22 | 310 | ;;; boolean? | 1 | pred | Lsrc | L22 | 311 | ;;; vector? | 1 | pred | Lsrc | L22 | 312 | ;;; box? | 1 | pred | Lsrc | L22 | 313 | ;;; = | 2 | pred | Lsrc | L22 | 314 | ;;; < | 2 | pred | Lsrc | L22 | 315 | ;;; <= | 2 | pred | Lsrc | L22 | 316 | ;;; > | 2 | pred | Lsrc | L22 | 317 | ;;; >= | 2 | pred | Lsrc | L22 | 318 | ;;; eq? | 2 | pred | Lsrc | L22 | 319 | ;;; vector-set! | 3 | effect | Lsrc | L22 | 320 | ;;; set-box! | 2 | effect | Lsrc | L22 | 321 | ;;; --------------------+-------+---------+------------+----------+ 322 | ;;; void | 0 | value | L1 | L22 | 323 | ;;; --------------------+-------+---------+------------+----------+ 324 | ;;; make-closure | 1 | value | L13 | L17 | 325 | ;;; closure-code | 2 | value | L13 | L22 | 326 | ;;; closure-ref | 2 | value | L13 | L22 | 327 | ;;; closure-code-set! | 2 | effect | L13 | L22 | 328 | ;;; closure-data-set! | 3 | effect | L13 | L22 | 329 | ;;; --------------------+-------+---------+------------+----------+ 330 | ;;; $vector-length-set! | 2 | effect | L17 | L22 | 331 | ;;; $set-car! | 2 | effect | L17 | L22 | 332 | ;;; $set-cdr! | 2 | effect | L17 | L22 | 333 | ;;; 334 | ;;; This is a slightly cleaned up version, but this might still be better 335 | ;;; cleaned up by adding a define-primitives form, perhaps even one that can 336 | ;;; be used in the later parts of the compiler. 337 | 338 | ;;; user value primitives that perform allocation 339 | (define user-alloc-value-prims 340 | '((cons . 2) (make-vector . 1) (box . 1))) 341 | 342 | ;;; user value primitives that do not perform allocation 343 | (define user-non-alloc-value-prims 344 | '((car . 1) (cdr . 1) (vector-ref . 2) (vector-length . 1) (unbox . 1) 345 | (+ . 2) (- . 2) (* . 2) (/ . 2))) 346 | 347 | ;;; user predicate primitives 348 | ;;; TODO: add procedure? 349 | (define user-pred-prims 350 | '((pair? . 1) (null? . 1) (boolean? . 1) (vector? . 1) (box? . 1) (= . 2) 351 | (< . 2) (<= . 2) (> . 2) (>= . 2) (eq? . 2))) 352 | 353 | ;;; user effect primitives 354 | (define user-effect-prims 355 | '((vector-set! . 3) (set-box! . 2))) 356 | 357 | ;;; an association list with the user primitives 358 | (define user-prims 359 | (append user-alloc-value-prims user-non-alloc-value-prims user-pred-prims 360 | user-effect-prims)) 361 | 362 | ;;; void primitive + non-allocation user value primitives 363 | (define void+user-non-alloc-value-prims 364 | (cons '(void . 0) user-non-alloc-value-prims)) 365 | 366 | ;;; an association list with void and all the user primitives 367 | (define void+user-prims 368 | (append user-alloc-value-prims void+user-non-alloc-value-prims 369 | user-pred-prims user-effect-prims)) 370 | 371 | ;;; all the allocation value primitives, including make-closure primitive 372 | (define closure+user-alloc-value-prims 373 | (cons '(make-closure . 1) user-alloc-value-prims)) 374 | 375 | ;;; all the non-allocation value primitives, include the closure primitives 376 | (define closure+void+user-non-alloc-value-prims 377 | (list* '(closure-code . 2) '(closure-ref . 2) 378 | void+user-non-alloc-value-prims)) 379 | 380 | ;; all the user effect primitives with the closure primitives 381 | (define closure+user-effect-prims 382 | (list* '(closure-code-set! . 2) '(closure-data-set! . 3) 383 | user-effect-prims)) 384 | 385 | ;; all the user effect primitives, closure primitives, and internal primitives 386 | (define internal+closure+user-effect-prims 387 | (list* '($vector-length-set! . 2) '($set-car! . 2) '($set-cdr! . 2) 388 | closure+user-effect-prims)) 389 | 390 | ;; association list including all prims except the three final internal 391 | ;; primitives 392 | (define closure+void+user-prims 393 | (append closure+user-alloc-value-prims 394 | closure+void+user-non-alloc-value-prims user-pred-prims 395 | closure+user-effect-prims)) 396 | 397 | ;;; various predicates for determining if a primitve is a valid prim. 398 | (define primitive? 399 | (lambda (x) 400 | (assq x user-prims))) 401 | 402 | (define void+primitive? 403 | (lambda (x) 404 | (assq x void+user-prims))) 405 | 406 | (define closure+void+primitive? 407 | (lambda (x) 408 | (assq x closure+void+user-prims))) 409 | 410 | (define effect-free-prim? 411 | (lambda (x) 412 | (assq x (append void+user-non-alloc-value-prims user-alloc-value-prims 413 | user-pred-prims)))) 414 | 415 | (define predicate-primitive? 416 | (lambda (x) 417 | (assq x user-pred-prims))) 418 | 419 | (define effect-primitive? 420 | (lambda (x) 421 | (assq x closure+user-effect-prims))) 422 | 423 | (define value-primitive? 424 | (lambda (x) 425 | (assq x (append closure+user-alloc-value-prims 426 | closure+void+user-non-alloc-value-prims)))) 427 | 428 | (define non-alloc-value-primitive? 429 | (lambda (x) 430 | (assq x closure+void+user-non-alloc-value-prims))) 431 | 432 | (define effect+internal-primitive? 433 | (lambda (x) 434 | (assq x internal+closure+user-effect-prims))) 435 | 436 | ;;;;;;;;;; 437 | ;;; Helper functions for identifying terminals in the nanopass languages. 438 | 439 | ;;; determine if we have a 61-bit signed integer 440 | (define target-fixnum? 441 | (lambda (x) 442 | (and (and (integer? x) (exact? x)) 443 | (<= (- (expt 2 60)) x (- (expt 2 60) 1))))) 444 | 445 | ;;; determine if we have a constant: #t, #f, '(), or 61-bit signed integer. 446 | (define constant? 447 | (lambda (x) 448 | (or (target-fixnum? x) (boolean? x) (null? x)))) 449 | 450 | ;;; determine if we have a valid datum (a constant, a pair of datum, or a 451 | ;;; vector of datum) 452 | (define datum? 453 | (lambda (x) 454 | (or (constant? x) 455 | (and (pair? x) (datum? (car x)) (datum? (cdr x))) 456 | (and (vector? x) 457 | (let loop ([i (vector-length x)]) 458 | (or (fx= i 0) 459 | (let ([i (fx- i 1)]) 460 | (and (datum? (vector-ref x i)) 461 | (loop i))))))))) 462 | 463 | ;;; determine if we have a 64-bit signed integer (used later in the compiler 464 | ;;; to hold the ptr representation). 465 | (define integer-64? 466 | (lambda (x) 467 | (and (and (integer? x) (exact? x)) 468 | (<= (- (expt 2 63)) x (- (expt 2 63) 1))))) 469 | 470 | ;;; Random helper available on most Scheme systems, but irritatingly not in 471 | ;;; the R6RS standard. 472 | (define make-list 473 | (case-lambda 474 | [(n) (make-list n (void))] 475 | [(n v) (let loop ([n n] [ls '()]) 476 | (if (zero? n) 477 | ls 478 | (loop (fx- n 1) (cons v ls))))])) 479 | 480 | ;;;;;;;; 481 | ;;; The standard (not very efficient) Scheme representation of sets as lists 482 | 483 | ;;; add an item to a set 484 | (define set-cons 485 | (lambda (x set) 486 | (if (memq x set) 487 | set 488 | (cons x set)))) 489 | 490 | ;;; construct the intersection of 0 to n sets 491 | (define intersect 492 | (lambda set* 493 | (if (null? set*) 494 | '() 495 | (foldl (lambda (setb seta) 496 | (let loop ([seta seta] [fset '()]) 497 | (if (null? seta) 498 | fset 499 | (let ([a (car seta)]) 500 | (if (memq a setb) 501 | (loop (cdr seta) (cons a fset)) 502 | (loop (cdr seta) fset)))))) 503 | (car set*) (cdr set*))))) 504 | 505 | ;;; construct the union of 0 to n sets 506 | (define union 507 | (lambda set* 508 | (if (null? set*) 509 | '() 510 | (foldl (lambda (setb seta) 511 | (let loop ([setb setb] [seta seta]) 512 | (if (null? setb) 513 | seta 514 | (loop (cdr setb) (set-cons (car setb) seta))))) 515 | (car set*) (cdr set*))))) 516 | 517 | ;;; construct the difference of 0 to n sets 518 | (define difference 519 | (lambda set* 520 | (if (null? set*) 521 | '() 522 | (foldr (lambda (setb seta) 523 | (let loop ([seta seta] [final '()]) 524 | (if (null? seta) 525 | final 526 | (let ([a (car seta)]) 527 | (if (memq a setb) 528 | (loop (cdr seta) final) 529 | (loop (cdr seta) (cons a final))))))) 530 | (car set*) (cdr set*))))) 531 | 532 | ;;; Language definitions for Lsrc and L1 to L22 533 | ;;; Both the language extension and the fully specified language is included 534 | ;;; (though the fully specified language may be out of date). This can be 535 | ;;; regenerated by doing: 536 | ;;; > (import (c)) 537 | ;;; > (import (nanopass)) 538 | ;;; > (language->s-expression L10) => generates L10 definition 539 | (define-language Lsrc 540 | (terminals 541 | (symbol (x)) 542 | (primitive (pr)) 543 | (constant (c)) 544 | (datum (d))) 545 | (Expr (e body) 546 | pr 547 | x 548 | c 549 | (quote d) 550 | (if e0 e1) 551 | (if e0 e1 e2) 552 | (or e* ...) 553 | (and e* ...) 554 | (not e) 555 | (begin e* ... e) 556 | (lambda (x* ...) body* ... body) 557 | (let ([x* e*] ...) body* ... body) 558 | (letrec ([x* e*] ...) body* ... body) 559 | (set! x e) 560 | (e e* ...))) 561 | 562 | ;;; Language 1: removes one-armed if and adds the void primitive 563 | ; 564 | ; (define-language L1 565 | ; (terminals (void+primitive (pr)) 566 | ; (symbol (x)) 567 | ; (constant (c)) 568 | ; (datum (d))) 569 | ; (Expr (e body) 570 | ; pr 571 | ; x 572 | ; c 573 | ; (quote d) 574 | ; (if e0 e1 e2) 575 | ; (or e* ...) 576 | ; (and e* ...) 577 | ; (not e) 578 | ; (begin e* ... e) 579 | ; (lambda (x* ...) body* ... body) 580 | ; (let ([x* e*] ...) body* ... body) 581 | ; (letrec ([x* e*] ...) body* ... body) 582 | ; (set! x e) 583 | ; (e e* ...))) 584 | ; 585 | (define-language L1 586 | (extends Lsrc) 587 | (terminals 588 | (- (primitive (pr))) 589 | (+ (void+primitive (pr)))) 590 | (Expr (e body) 591 | (- (if e0 e1)))) 592 | 593 | ;;; Language 2: removes or, and, and not forms 594 | ; 595 | ; (define-language L2 596 | ; (terminals (void+primitive (pr)) 597 | ; (symbol (x)) 598 | ; (constant (c)) 599 | ; (datum (d))) 600 | ; (Expr (e body) 601 | ; pr 602 | ; x 603 | ; c 604 | ; (quote d) 605 | ; (if e0 e1 e2) 606 | ; (begin e* ... e) 607 | ; (lambda (x* ...) body* ... body) 608 | ; (let ([x* e*] ...) body* ... body) 609 | ; (letrec ([x* e*] ...) body* ... body) 610 | ; (set! x e) 611 | ; (e e* ...))) 612 | ; 613 | (define-language L2 614 | (extends L1) 615 | (Expr (e body) 616 | (- (or e* ...) 617 | (and e* ...) 618 | (not e)))) 619 | 620 | ;;; Language 3: removes multiple expressions from the body of lambda, let, 621 | ;;; and letrec (to be replaced with a single begin expression that contains 622 | ;;; the expressions from the body). 623 | ; 624 | ; (define-language L3 625 | ; (terminals (void+primitive (pr)) 626 | ; (symbol (x)) 627 | ; (constant (c)) 628 | ; (datum (d))) 629 | ; (Expr (e body) 630 | ; (letrec ([x* e*] ...) body) 631 | ; (let ([x* e*] ...) body) 632 | ; (lambda (x* ...) body) 633 | ; pr 634 | ; x 635 | ; c 636 | ; (quote d) 637 | ; (if e0 e1 e2) 638 | ; (begin e* ... e) 639 | ; (set! x e) 640 | ; (e e* ...))) 641 | ; 642 | (define-language L3 643 | (extends L2) 644 | (Expr (e body) 645 | (- (lambda (x* ...) body* ... body) 646 | (let ([x* e*] ...) body* ... body) 647 | (letrec ([x* e*] ...) body* ... body)) 648 | (+ (lambda (x* ...) body) 649 | (let ([x* e*] ...) body) 650 | (letrec ([x* e*] ...) body)))) 651 | 652 | ;;; Language 4: removes raw primitives (to be replaced with a lambda and a 653 | ;;; primitive call). 654 | ; 655 | ; (define-language L4 656 | ; (terminals (void+primitive (pr)) 657 | ; (symbol (x)) 658 | ; (constant (c)) 659 | ; (datum (d))) 660 | ; (Expr (e body) 661 | ; (primcall pr e* ...) 662 | ; (letrec ([x* e*] ...) body) 663 | ; (let ([x* e*] ...) body) 664 | ; (lambda (x* ...) body) 665 | ; x 666 | ; c 667 | ; (quote d) 668 | ; (if e0 e1 e2) 669 | ; (begin e* ... e) 670 | ; (set! x e) 671 | ; (e e* ...))) 672 | ; 673 | (define-language L4 674 | (extends L3) 675 | (Expr (e body) 676 | (- pr) 677 | (+ (primcall pr e* ...) => (pr e* ...)))) 678 | 679 | ;;; Language 5: removes raw constants (to be replaced with quoted constant). 680 | ; 681 | ; (define-language L5 682 | ; (terminals 683 | ; (void+primitive (pr)) 684 | ; (symbol (x)) 685 | ; (datum (d))) 686 | ; (Expr (e body) 687 | ; (primcall pr e* ...) 688 | ; (letrec ([x* e*] ...) body) 689 | ; (let ([x* e*] ...) body) 690 | ; (lambda (x* ...) body) 691 | ; x 692 | ; (quote d) 693 | ; (if e0 e1 e2) 694 | ; (begin e* ... e) 695 | ; (set! x e) 696 | ; (e e* ...))) 697 | ; 698 | (define-language L5 699 | (extends L4) 700 | (terminals 701 | (- (constant (c)))) 702 | (Expr (e body) 703 | (- c))) 704 | 705 | ;;; Language 6: removes quoted datum (to be replaced with explicit calls to 706 | ;;; cons and make-vector+vector-set!). 707 | ; 708 | ; (define-language L6 709 | ; (terminals 710 | ; (constant (c)) 711 | ; (void+primitive (pr)) 712 | ; (symbol (x))) 713 | ; (Expr (e body) 714 | ; (quote c) 715 | ; (primcall pr e* ...) 716 | ; (letrec ([x* e*] ...) body) 717 | ; (let ([x* e*] ...) body) 718 | ; (lambda (x* ...) body) 719 | ; x 720 | ; (if e0 e1 e2) 721 | ; (begin e* ... e) 722 | ; (set! x e) 723 | ; (e e* ...))) 724 | ; 725 | (define-language L6 726 | (extends L5) 727 | (terminals 728 | (- (datum (d))) 729 | (+ (constant (c)))) 730 | (Expr (e body) 731 | (- (quote d)) 732 | (+ (quote c)))) 733 | 734 | ;;; Language 7: adds a listing of assigned variables to the body of the 735 | ;;; binding forms: let, letrec, and lambda. 736 | ; (define-language L7 737 | ; (terminals 738 | ; (symbol (x a)) 739 | ; (constant (c)) 740 | ; (void+primitive (pr))) 741 | ; (Expr (e body) 742 | ; (letrec ([x* e*] ...) abody) 743 | ; (let ([x* e*] ...) abody) 744 | ; (lambda (x* ...) abody) 745 | ; (quote c) 746 | ; (primcall pr e* ...) 747 | ; x 748 | ; (if e0 e1 e2) 749 | ; (begin e* ... e) 750 | ; (set! x e) 751 | ; (e e* ...)) 752 | ; (AssignedBody (abody) 753 | ; (assigned (a* ...) body))) 754 | ; 755 | (define-language L7 756 | (extends L6) 757 | (terminals 758 | (- (symbol (x))) 759 | (+ (symbol (x a)))) 760 | (Expr (e body) 761 | (- (lambda (x* ...) body) 762 | (let ([x* e*] ...) body) 763 | (letrec ([x* e*] ...) body)) 764 | (+ (lambda (x* ...) abody) 765 | (let ([x* e*] ...) abody) 766 | (letrec ([x* e*] ...) abody))) 767 | (AssignedBody (abody) 768 | (+ (assigned (a* ...) body)))) 769 | 770 | ;;; Language 8: letrec binding is changed to only bind variables to lambdas. 771 | ; 772 | ; (define-language L8 773 | ; (terminals (symbol (x a)) 774 | ; (constant (c)) 775 | ; (void+primitive (pr))) 776 | ; (Expr (e body) 777 | ; (letrec ([x* le*] ...) body) 778 | ; le 779 | ; (let ([x* e*] ...) abody) 780 | ; (quote c) 781 | ; (primcall pr e* ...) 782 | ; x 783 | ; (if e0 e1 e2) 784 | ; (begin e* ... e) 785 | ; (set! x e) 786 | ; (e e* ...)) 787 | ; (AssignedBody (abody) 788 | ; (assigned (a* ...) body)) 789 | ; (LambdaExpr (le) 790 | ; (lambda (x* ...) abody))) 791 | ; 792 | (define-language L8 793 | (extends L7) 794 | (Expr (e body) 795 | (- (lambda (x* ...) abody) 796 | (letrec ([x* e*] ...) abody)) 797 | (+ le 798 | (letrec ([x* le*] ...) body))) 799 | (LambdaExpr (le) 800 | (+ (lambda (x* ...) abody)))) 801 | 802 | ;;; Language 9: removes lambda expressions from expression context, 803 | ;;; effectively meaning we can only have lambdas bound in the right-hand-side 804 | ;;; of letrec expressions. 805 | ; 806 | ; (define-language L9 807 | ; (terminals 808 | ; (symbol (x a)) 809 | ; (constant (c)) 810 | ; (void+primitive (pr))) 811 | ; (Expr (e body) 812 | ; (letrec ([x* le*] ...) body) 813 | ; (let ([x* e*] ...) abody) 814 | ; (quote c) 815 | ; (primcall pr e* ...) 816 | ; x 817 | ; (if e0 e1 e2) 818 | ; (begin e* ... e) 819 | ; (set! x e) 820 | ; (e e* ...)) 821 | ; (AssignedBody (abody) 822 | ; (assigned (a* ...) body)) 823 | ; (LambdaExpr (le) 824 | ; (lambda (x* ...) abody))) 825 | ; 826 | (define-language L9 827 | (extends L8) 828 | (Expr (e body) 829 | (- le))) 830 | 831 | ;;; Language 10: removes set! and assigned bodies (to be replaced by set-box! 832 | ;;; primcall for set!, and unbox primcall for references of assigned variables). 833 | ; 834 | ; (define-language L10 835 | ; (terminals 836 | ; (symbol (x)) 837 | ; (constant (c)) 838 | ; (void+primitive (pr))) 839 | ; (Expr (e body) 840 | ; (let ([x* e*] ...) body) 841 | ; (letrec ([x* le*] ...) body) 842 | ; (quote c) 843 | ; (primcall pr e* ...) 844 | ; x 845 | ; (if e0 e1 e2) 846 | ; (begin e* ... e) 847 | ; (e e* ...)) 848 | ; (LambdaExpr (le) 849 | ; (lambda (x* ...) body))) 850 | ; 851 | (define-language L10 852 | (extends L9) 853 | (terminals 854 | (- (symbol (x a))) 855 | (+ (symbol (x)))) 856 | (Expr (e body) 857 | (- (let ([x* e*] ...) abody) 858 | (set! x e)) 859 | (+ (let ([x* e*] ...) body))) 860 | (LambdaExpr (le) 861 | (- (lambda (x* ...) abody)) 862 | (+ (lambda (x* ...) body))) 863 | (AssignedBody (abody) 864 | (- (assigned (a* ...) body)))) 865 | 866 | ;;; Language 11: add a list of free variables to the body of lambda 867 | ;;; expressions (starting closure conversion code). 868 | ; 869 | ; (define-language L11 870 | ; (terminals 871 | ; (symbol (x f)) 872 | ; (constant (c)) 873 | ; (void+primitive (pr))) 874 | ; (Expr (e body) 875 | ; (let ([x* e*] ...) body) 876 | ; (letrec ([x* le*] ...) body) 877 | ; (quote c) 878 | ; (primcall pr e* ...) 879 | ; x 880 | ; (if e0 e1 e2) 881 | ; (begin e* ... e) 882 | ; (e e* ...)) 883 | ; (LambdaExpr (le) 884 | ; (lambda (x* ...) fbody)) 885 | ; (FreeBody (fbody) 886 | ; (free (f* ...) body))) 887 | ; 888 | (define-language L11 889 | (extends L10) 890 | (terminals 891 | (- (symbol (x))) 892 | (+ (symbol (x f)))) 893 | (LambdaExpr (le) 894 | (- (lambda (x* ...) body)) 895 | (+ (lambda (x* ...) fbody))) 896 | (FreeBody (fbody) 897 | (+ (free (f* ...) body)))) 898 | 899 | ;;; Language L12: removes the letrec form and adds closure and labels forms 900 | ;;; to replace it. The closure form binds a variable to a label (code 901 | ;;; pointer) and its set of free variables, and the labels form binds labels 902 | ;;; (code pointer) to lambda expressions. 903 | ; 904 | ; (define-language L12 905 | ; (terminals 906 | ; (symbol (x f l)) 907 | ; (constant (c)) 908 | ; (void+primitive (pr))) 909 | ; (Expr (e body) 910 | ; (label l) 911 | ; (closures ((x* l* f** ...) ...) lbody) 912 | ; (let ([x* e*] ...) body) 913 | ; (quote c) 914 | ; (primcall pr e* ...) 915 | ; x 916 | ; (if e0 e1 e2) 917 | ; (begin e* ... e) 918 | ; (e e* ...)) 919 | ; (LambdaExpr (le) 920 | ; (lambda (x* ...) fbody)) 921 | ; (FreeBody (fbody) 922 | ; (free (f* ...) body)) 923 | ; (LabelsBody (lbody) 924 | ; (labels ([l* le*] ...) body))) 925 | ; 926 | (define-language L12 927 | (extends L11) 928 | (terminals 929 | (- (symbol (x f))) 930 | (+ (symbol (x f l)))) 931 | (Expr (e body) 932 | (- (letrec ([x* le*] ...) body)) 933 | (+ (closures ([x* l* f** ...] ...) lbody) 934 | (label l))) 935 | (LabelsBody (lbody) 936 | (+ (labels ([l* le*] ...) body)))) 937 | 938 | ;;; Language 13: finishes closure conversion, removes the closures form, 939 | ;;; replacing it with primitive calls to deal with closure objects, and 940 | ;;; raises the labels from into the Expr non-terminal. 941 | ; 942 | ; (define-language L13 943 | ; (terminals 944 | ; (closure+void+primitive (pr)) 945 | ; (symbol (x f l)) 946 | ; (constant (c))) 947 | ; (Expr (e body) 948 | ; (labels ([l* le*] ...) body) 949 | ; (label l) 950 | ; (let ([x* e*] ...) body) 951 | ; (quote c) 952 | ; (primcall pr e* ...) 953 | ; x 954 | ; (if e0 e1 e2) 955 | ; (begin e* ... e) 956 | ; (e e* ...)) 957 | ; (LambdaExpr (le) 958 | ; (lambda (x* ...) body))) 959 | ; 960 | (define-language L13 961 | (extends L12) 962 | (terminals 963 | (- (void+primitive (pr))) 964 | (+ (closure+void+primitive (pr)))) 965 | (Expr (e body) 966 | (- (closures ([x* l* f** ...] ...) lbody)) 967 | (+ (labels ([l* le*] ...) body))) 968 | (LabelsBody (lbody) 969 | (- (labels ([l* le*] ...) body))) 970 | (LambdaExpr (le) 971 | (- (lambda (x* ...) fbody)) 972 | (+ (lambda (x* ...) body))) 973 | (FreeBody (fbody) 974 | (- (free (f* ...) body)))) 975 | 976 | ;;; Language 14: removes labels form from the Expr nonterminal and puts a 977 | ;;; single labels form at the top. Essentially this raises all of our 978 | ;;; closure converted functions to the top. 979 | ; 980 | ; (define-language L14 981 | ; (entry Program) 982 | ; (terminals 983 | ; (closure+void+primitive (pr)) 984 | ; (symbol (x f l)) 985 | ; (constant (c))) 986 | ; (Expr (e body) 987 | ; (label l) 988 | ; (let ([x* e*] ...) body) 989 | ; (quote c) 990 | ; (primcall pr e* ...) 991 | ; x 992 | ; (if e0 e1 e2) 993 | ; (begin e* ... e) 994 | ; (e e* ...)) 995 | ; (LambdaExpr (le) 996 | ; (lambda (x* ...) body)) 997 | ; (Program (p) 998 | ; (labels ([l* le*] ...) l))) 999 | ; 1000 | (define-language L14 1001 | (extends L13) 1002 | (entry Program) 1003 | (Program (p) 1004 | (+ (labels ([l* le*] ...) l))) 1005 | (Expr (e body) 1006 | (- (labels ([l* le*] ...) body)))) 1007 | 1008 | ;;; Language 15: moves simple expressions (constants and variable references) 1009 | ;;; out of the Expr nonterminal, and replaces expressions referred to in 1010 | ;;; calls and primcalls with simple expressions. This effectively removes 1011 | ;;; complex operands to calls and primcalls. 1012 | ; 1013 | ; (define-language L15 1014 | ; (entry Program) 1015 | ; (terminals 1016 | ; (closure+void+primitive (pr)) 1017 | ; (symbol (x f l)) 1018 | ; (constant (c))) 1019 | ; (Expr (e body) 1020 | ; (se se* ...) 1021 | ; (primcall pr se* ...) 1022 | ; se 1023 | ; (label l) 1024 | ; (let ([x* e*] ...) body) 1025 | ; (if e0 e1 e2) 1026 | ; (begin e* ... e)) 1027 | ; (LambdaExpr (le) 1028 | ; (lambda (x* ...) body)) 1029 | ; (Program (p) 1030 | ; (labels ([l* le*] ...) l)) 1031 | ; (SimpleExpr (se) 1032 | ; x 1033 | ; (quote c))) 1034 | ; 1035 | (define-language L15 1036 | (extends L14) 1037 | (Expr (e body) 1038 | (- x 1039 | (quote c) 1040 | (label l) 1041 | (primcall pr e* ...) 1042 | (e e* ...)) 1043 | (+ se 1044 | (primcall pr se* ...) => (pr se* ...) 1045 | (se se* ...))) 1046 | (SimpleExpr (se) 1047 | (+ x 1048 | (label l) 1049 | (quote c)))) 1050 | 1051 | ;;; Language 16: separates the Expr nonterminal into the Value, Effect, and 1052 | ;;; Predicate nonterminals. This is needed to translate from our expression 1053 | ;;; language into a language like C that has statements (effects) and 1054 | ;;; expressions (values) and predicates that need to be simply values. 1055 | (define-language L16 1056 | (terminals 1057 | (symbol (x l)) 1058 | (value-primitive (vpr)) 1059 | (effect-primitive (epr)) 1060 | (predicate-primitive (ppr)) 1061 | (constant (c))) 1062 | (Program (prog) 1063 | (labels ([l* le*] ...) l)) 1064 | (LambdaExpr (le) 1065 | (lambda (x* ...) body)) 1066 | (SimpleExpr (se) 1067 | x 1068 | (label l) 1069 | (quote c)) 1070 | (Value (v body) 1071 | se 1072 | (if p0 v1 v2) 1073 | (begin e* ... v) 1074 | (let ([x* v*] ...) body) 1075 | (primcall vpr se* ...) => (vpr se* ...) 1076 | (se se* ...)) 1077 | (Effect (e) 1078 | (nop) 1079 | (if p0 e1 e2) 1080 | (begin e* ... e) 1081 | (let ([x* v*] ...) e) 1082 | (primcall epr se* ...) => (epr se* ...) 1083 | (se se* ...)) 1084 | (Predicate (p) 1085 | (true) 1086 | (false) 1087 | (if p0 p1 p2) 1088 | (begin e* ... p) 1089 | (let ([x* v*] ...) p) 1090 | (primcall ppr se* ...) => (ppr se* ...))) 1091 | 1092 | ;;; Language 17: removes the allocation primitives: cons, box, make-vector, 1093 | ;;; and make-closure and adds a generic alloc form for specifying allocation. It 1094 | ;;; also adds raw integers for specifying type tags in the alloc form. 1095 | ; 1096 | ; (define-language L17 1097 | ; (entry Program) 1098 | ; (terminals 1099 | ; (integer-64 (i)) 1100 | ; (effect+internal-primitive (epr)) 1101 | ; (non-alloc-value-primitive (vpr)) 1102 | ; (symbol (x l)) 1103 | ; (predicate-primitive (ppr)) 1104 | ; (constant (c))) 1105 | ; (Program (prog) 1106 | ; (labels ([l* le*] ...) l)) 1107 | ; (LambdaExpr (le) 1108 | ; (lambda (x* ...) body)) 1109 | ; (SimpleExpr (se) 1110 | ; x 1111 | ; (label l) 1112 | ; (quote c)) 1113 | ; (Value (v body) 1114 | ; (alloc i se) 1115 | ; se 1116 | ; (if p0 v1 v2) 1117 | ; (begin e* ... v) 1118 | ; (let ([x* v*] ...) body) 1119 | ; (primcall vpr se* ...) 1120 | ; (se se* ...)) 1121 | ; (Effect (e) 1122 | ; (nop) 1123 | ; (if p0 e1 e2) 1124 | ; (begin e* ... e) 1125 | ; (let ([x* v*] ...) e) 1126 | ; (primcall epr se* ...) 1127 | ; (se se* ...)) 1128 | ; (Predicate (p) 1129 | ; (true) 1130 | ; (false) 1131 | ; (if p0 p1 p2) 1132 | ; (begin e* ... p) 1133 | ; (let ([x* v*] ...) p) 1134 | ; (primcall ppr se* ...))) 1135 | ; 1136 | (define-language L17 1137 | (extends L16) 1138 | (terminals 1139 | (- (value-primitive (vpr)) 1140 | (effect-primitive (epr))) 1141 | (+ (non-alloc-value-primitive (vpr)) 1142 | (effect+internal-primitive (epr)) 1143 | (integer-64 (i)))) 1144 | (Value (v body) 1145 | (+ (alloc i se)))) 1146 | 1147 | ;;; Language L18: removes let forms and replaces them with a top-level locals 1148 | ;;; form that indicates which variables are bound in the function (so they 1149 | ;;; can be listed at the top of our C function) and set! that do simple 1150 | ;;; assignments. 1151 | ; 1152 | ; (define-language L18 1153 | ; (entry Program) 1154 | ; (terminals 1155 | ; (integer-64 (i)) 1156 | ; (effect+internal-primitive (epr)) 1157 | ; (non-alloc-value-primitive (vpr)) 1158 | ; (symbol (x l)) 1159 | ; (predicate-primitive (ppr)) 1160 | ; (constant (c))) 1161 | ; (Program (prog) 1162 | ; (labels ([l* le*] ...) l)) 1163 | ; (SimpleExpr (se) 1164 | ; x 1165 | ; (label l) 1166 | ; (quote c)) 1167 | ; (Value (v body) 1168 | ; (alloc i se) 1169 | ; se 1170 | ; (if p0 v1 v2) 1171 | ; (begin e* ... v) 1172 | ; (primcall vpr se* ...) 1173 | ; (se se* ...)) 1174 | ; (Effect (e) 1175 | ; (set! x v) 1176 | ; (nop) 1177 | ; (if p0 e1 e2) 1178 | ; (begin e* ... e) 1179 | ; (primcall epr se* ...) 1180 | ; (se se* ...)) 1181 | ; (Predicate (p) 1182 | ; (true) 1183 | ; (false) 1184 | ; (if p0 p1 p2) 1185 | ; (begin e* ... p) 1186 | ; (primcall ppr se* ...)) 1187 | ; (LocalsBody (lbody) 1188 | ; (locals (x* ...) body)) 1189 | ; (LambdaExpr (le) 1190 | ; (lambda (x* ...) lbody))) 1191 | ; 1192 | (define-language L18 1193 | (extends L17) 1194 | (Value (v body) 1195 | (- (let ([x* v*] ...) body))) 1196 | (Effect (e) 1197 | (- (let ([x* v*] ...) e)) 1198 | (+ (set! x v))) 1199 | (Predicate (p) 1200 | (- (let ([x* v*] ...) p))) 1201 | (LambdaExpr (le) 1202 | (- (lambda (x* ...) body)) 1203 | (+ (lambda (x* ...) lbody))) 1204 | (LocalsBody (lbody) 1205 | (+ (locals (x* ...) body)))) 1206 | 1207 | ;;; Language 19: simplify the right-hand-side of a set! so that it can 1208 | ;;; contain, simple expression, allocations, primcalls, and function calls, 1209 | ;;; but not ifs, or begins. 1210 | ; 1211 | ; (define-language L19 1212 | ; (terminals 1213 | ; (integer-64 (i)) 1214 | ; (effect+internal-primitive (epr)) 1215 | ; (non-alloc-value-primitive (vpr)) 1216 | ; (symbol (x l)) 1217 | ; (predicate-primitive (ppr)) 1218 | ; (constant (c))) 1219 | ; (Program (prog) 1220 | ; (labels ([l* le*] ...) l)) 1221 | ; (SimpleExpr (se) 1222 | ; x 1223 | ; (label l) 1224 | ; (quote c)) 1225 | ; (Value (v body) 1226 | ; rhs 1227 | ; (if p0 v1 v2) 1228 | ; (begin e* ... v)) 1229 | ; (Effect (e) 1230 | ; (set! x rhs) 1231 | ; (nop) 1232 | ; (if p0 e1 e2) 1233 | ; (begin e* ... e) 1234 | ; (primcall epr se* ...) 1235 | ; (se se* ...)) 1236 | ; (Predicate (p) 1237 | ; (true) 1238 | ; (false) 1239 | ; (if p0 p1 p2) 1240 | ; (begin e* ... p) 1241 | ; (primcall ppr se* ...)) 1242 | ; (LocalsBody (lbody) 1243 | ; (locals (x* ...) body)) 1244 | ; (LambdaExpr (le) 1245 | ; (lambda (x* ...) lbody)) 1246 | ; (Rhs (rhs) 1247 | ; se 1248 | ; (alloc i se) 1249 | ; (primcall vpr se* ...) 1250 | ; (se se* ...))) 1251 | ; 1252 | (define-language L19 1253 | (extends L18) 1254 | (Value (v body) 1255 | (- se 1256 | (alloc i se) 1257 | (primcall vpr se* ...) 1258 | (se se* ...)) 1259 | (+ rhs)) 1260 | (Rhs (rhs) 1261 | (+ se 1262 | (alloc i se) 1263 | (primcall vpr se* ...) => (vpr se* ...) 1264 | (se se* ...))) 1265 | (Effect (e) 1266 | (- (set! x v)) 1267 | (+ (set! x rhs)))) 1268 | 1269 | ;;; Language L20: remove begin from the predicate production (effectively 1270 | ;;; forcing the if to only have if, true, false, and predicate primitive 1271 | ;;; calls). 1272 | ;;; TODO: removed this language because our push-if pass was buggy, and 1273 | ;;; fixing it requires us to flatten code into something like 1274 | ;;; basic blocks, and we can avoid doing this since our target 1275 | ;;; is C. We could revisit it for other backend targets. 1276 | ; 1277 | ; (define-language L20 1278 | ; (terminals 1279 | ; (integer-64 (i)) 1280 | ; (effect+internal-primitive (epr)) 1281 | ; (non-alloc-value-primitive (vpr)) 1282 | ; (symbol (x l)) 1283 | ; (predicate-primitive (ppr)) 1284 | ; (constant (c))) 1285 | ; (Program (prog) 1286 | ; (labels ([l* le*] ...) l)) 1287 | ; (SimpleExpr (se) 1288 | ; x 1289 | ; (label l) 1290 | ; (quote c)) 1291 | ; (Value (v body) 1292 | ; rhs 1293 | ; (if p0 v1 v2) 1294 | ; (begin e* ... v)) 1295 | ; (Effect (e) 1296 | ; (set! x rhs) 1297 | ; (nop) 1298 | ; (if p0 e1 e2) 1299 | ; (begin e* ... e) 1300 | ; (primcall epr se* ...) 1301 | ; (se se* ...)) 1302 | ; (Predicate (p) 1303 | ; (true) 1304 | ; (false) 1305 | ; (if p0 p1 p2) 1306 | ; (primcall ppr se* ...)) 1307 | ; (LocalsBody (lbody) 1308 | ; (locals (x* ...) body)) 1309 | ; (LambdaExpr (le) 1310 | ; (lambda (x* ...) lbody)) 1311 | ; (Rhs (rhs) 1312 | ; se 1313 | ; (alloc i se) 1314 | ; (primcall vpr se* ...) 1315 | ; (se se* ...))) 1316 | ; 1317 | ; (define-language L20 1318 | ; (extends L19) 1319 | ; (Predicate (p) 1320 | ; (- (begin e* ... p)))) 1321 | 1322 | ;;; Language 21: remove quoted constants and replace it with our raw ptr 1323 | ;;; representation (i.e. 64-bit integers) 1324 | ; 1325 | ; (define-language L21 1326 | ; (terminals 1327 | ; (integer-64 (i)) 1328 | ; (effect+internal-primitive (epr)) 1329 | ; (non-alloc-value-primitive (vpr)) 1330 | ; (symbol (x l)) 1331 | ; (predicate-primitive (ppr))) 1332 | ; (Program (prog) 1333 | ; (labels ([l* le*] ...) l)) 1334 | ; (SimpleExpr (se) 1335 | ; i 1336 | ; x 1337 | ; (label l)) 1338 | ; (Value (v body) 1339 | ; rhs 1340 | ; (if p0 v1 v2) 1341 | ; (begin e* ... v)) 1342 | ; (Effect (e) 1343 | ; (set! x rhs) 1344 | ; (nop) 1345 | ; (if p0 e1 e2) 1346 | ; (begin e* ... e) 1347 | ; (primcall epr se* ...) 1348 | ; (se se* ...)) 1349 | ; (Predicate (p) 1350 | ; (true) 1351 | ; (false) 1352 | ; (if p0 p1 p2) 1353 | ; (primcall ppr se* ...)) 1354 | ; (LocalsBody (lbody) 1355 | ; (locals (x* ...) body)) 1356 | ; (LambdaExpr (le) 1357 | ; (lambda (x* ...) lbody)) 1358 | ; (Rhs (rhs) 1359 | ; se 1360 | ; (alloc i se) 1361 | ; (primcall vpr se* ...) 1362 | ; (se se* ...))) 1363 | ; 1364 | (define-language L21 1365 | (extends L19) 1366 | (terminals 1367 | (- (constant (c)))) 1368 | (SimpleExpr (se) 1369 | (- (quote c)) 1370 | (+ i))) 1371 | 1372 | ;;; Language 22: remove the primcalls and replace them with mref (memory 1373 | ;;; references), add, subtract, multiply, divide, shift-right, shift-left, 1374 | ;;; logand, mset! (memory set), =, <, and <=. 1375 | ;;; 1376 | ;;; TODO: we should probably replace this with "machine" instructions 1377 | ;;; instead, so that we can more easily extend the language and generate C 1378 | ;;; code from it. 1379 | ; 1380 | ; (define-language L22 1381 | ; (terminals 1382 | ; (integer-64 (i)) 1383 | ; (effect+internal-primitive (epr)) 1384 | ; (non-alloc-value-primitive (vpr)) 1385 | ; (symbol (x l)) 1386 | ; (predicate-primitive (ppr))) 1387 | ; (Program (prog) 1388 | ; (labels ([l* le*] ...) l)) 1389 | ; (SimpleExpr (se) 1390 | ; (logand se0 se1) 1391 | ; (shift-left se0 se1) 1392 | ; (shift-right se0 se1) 1393 | ; (divide se0 se1) 1394 | ; (multiply se0 se1) 1395 | ; (subtract se0 se1) 1396 | ; (add se0 se1) 1397 | ; (mref se0 (maybe se1?) i) 1398 | ; i 1399 | ; x 1400 | ; (label l)) 1401 | ; (Value (v body) 1402 | ; rhs 1403 | ; (if p0 v1 v2) 1404 | ; (begin e* ... v)) 1405 | ; (Effect (e) 1406 | ; (mset! se0 (maybe se1?) i se2) 1407 | ; (set! x rhs) 1408 | ; (nop) 1409 | ; (if p0 e1 e2) 1410 | ; (begin e* ... e) 1411 | ; (se se* ...)) 1412 | ; (Predicate (p) 1413 | ; (<= se0 se1) 1414 | ; (< se0 se1) 1415 | ; (= se0 se1) 1416 | ; (true) 1417 | ; (false) 1418 | ; (if p0 p1 p2)) 1419 | ; (LocalsBody (lbody) 1420 | ; (locals (x* ...) body)) 1421 | ; (LambdaExpr (le) 1422 | ; (lambda (x* ...) lbody)) 1423 | ; (Rhs (rhs) 1424 | ; se 1425 | ; (alloc i se) 1426 | ; (se se* ...))) 1427 | ; 1428 | (define-language L22 1429 | (extends L21) 1430 | (Rhs (rhs) 1431 | (- (primcall vpr se* ...))) 1432 | (SimpleExpr (se) 1433 | (+ (mref se0 (maybe se1?) i) 1434 | (add se0 se1) 1435 | (subtract se0 se1) 1436 | (multiply se0 se1) 1437 | (divide se0 se1) 1438 | (shift-right se0 se1) 1439 | (shift-left se0 se1) 1440 | (logand se0 se1))) 1441 | (Effect (e) 1442 | (- (primcall epr se* ...)) 1443 | (+ (mset! se0 (maybe se1?) i se2))) 1444 | (Predicate (p) 1445 | (- (primcall ppr se* ...)) 1446 | (+ (= se0 se1) 1447 | (< se0 se1) 1448 | (<= se0 se1)))) 1449 | 1450 | ;;;;;;;;; 1451 | ;;; beginning of our pass listings 1452 | 1453 | ;;; pass: parse-and-rename : S-expression -> Lsrc (or error) 1454 | ;;; 1455 | ;;; parses an S-expression, and, if it conforms to the input language, 1456 | ;;; renames the local variables to be represented with a unique variable. 1457 | ;;; This helps us to separate keywords from varialbes and recognize one 1458 | ;;; variable binding as different from another. This step is also called 1459 | ;;; alpha-renaming or alpha-conversion. The output will be in the Lsrc 1460 | ;;; language forms, represented as records. 1461 | ;;; 1462 | ;;; Some design decisions here: We could have decided to have this pass 1463 | ;;; remove one-armed ifs, remove and, or, and not, setup begins in the body 1464 | ;;; of our letrec, let, and lambda, and potentially quoted constants and 1465 | ;;; eta-expanded raw primitives, rather than doing each of these as separate 1466 | ;;; passes. I have not done this here, primarily for educational reasons, 1467 | ;;; since these simple passes are a gentle introduction to how the passes are 1468 | ;;; written. 1469 | ;;; 1470 | (define-pass parse-and-rename : * (e) -> Lsrc () 1471 | ;;; Helper functions for this pass. 1472 | (definitions 1473 | ;;; process-body - used to process the body of begin, let, letrec, and 1474 | ;;; lambda expressions. since all four of these have the same pattern in 1475 | ;;; the body. 1476 | (define process-body 1477 | (lambda (who env body* f) 1478 | (when (null? body*) (error who "invalid empty body")) 1479 | (let loop ([body (car body*)] [body* (cdr body*)] [rbody* '()]) 1480 | (if (null? body*) 1481 | (f (reverse rbody*) (Expr body env)) 1482 | (loop (car body*) (cdr body*) 1483 | (cons (Expr body env) rbody*)))))) 1484 | ;;; vars-unique? - processes the list of bindings to make sure all of the 1485 | ;;; variable names are different (i.e. we don't want to allow 1486 | ;;; (lambda (x x) x), since we would not know which x is which). 1487 | (define vars-unique? 1488 | (lambda (fmls) 1489 | (let loop ([fmls fmls]) 1490 | (or (null? fmls) 1491 | (and (not (memq (car fmls) (cdr fmls))) 1492 | (loop (cdr fmls))))))) 1493 | ;;; unique-vars - builds a list of unique variables based on a set of 1494 | ;;; formals and extends the environment. it takes a function as an 1495 | ;;; argument (effectively a continuation), and passes it the updated 1496 | ;;; environment and a list of unique variables. 1497 | (define unique-vars 1498 | (lambda (env fmls f) 1499 | (unless (vars-unique? fmls) 1500 | (error 'unique-vars "invalid formals ~a" fmls)) 1501 | (let loop ([fmls fmls] [env env] [rufmls '()]) 1502 | (if (null? fmls) 1503 | (f env (reverse rufmls)) 1504 | (let* ([fml (car fmls)] [ufml (unique-var fml)]) 1505 | (loop (cdr fmls) (cons (cons fml ufml) env) 1506 | (cons ufml rufmls))))))) 1507 | ;;; process-bindings - processes the bindings of a let or letrec and 1508 | ;;; produces bindings for unique variables for each of the original 1509 | ;;; variables. it also processes the right-hand sides of the variable 1510 | ;;; bindings and selects either the original environment (for let) or the 1511 | ;;; updated environment (for letrec). 1512 | (define process-bindings 1513 | (lambda (rec? env bindings f) 1514 | (let loop ([bindings bindings] [rfml* '()] [re* '()]) 1515 | (if (null? bindings) 1516 | (unique-vars env rfml* 1517 | (lambda (new-env rufml*) 1518 | (let ([env (if rec? new-env env)]) 1519 | (let loop ([rufml* rufml*] 1520 | [re* re*] 1521 | [ufml* '()] 1522 | [e* '()]) 1523 | (if (null? rufml*) 1524 | (f new-env ufml* e*) 1525 | (loop (cdr rufml*) (cdr re*) 1526 | (cons (car rufml*) ufml*) 1527 | (cons (Expr (car re*) env) e*))))))) 1528 | (let ([binding (car bindings)]) 1529 | (loop (cdr bindings) (cons (car binding) rfml*) 1530 | (cons (cadr binding) re*))))))) 1531 | ;;; Expr* - helper to process a list of expressions. 1532 | (define Expr* 1533 | (lambda (e* env) 1534 | (map (lambda (e) (Expr e env)) e*))) 1535 | ;;; with-output-language rebinds quasiquote so that it will build 1536 | ;;; language records. 1537 | (with-output-language (Lsrc Expr) 1538 | ;;; build-primitive - this is a helper function to build entries in the 1539 | ;;; initial environment for our user primitives. the initial 1540 | ;;; enviornment contains a mapping of keywords and primitives to 1541 | ;;; processing functions that check their arity (in the case of 1542 | ;;; primitives) or their forms (in the case of keywords). These are 1543 | ;;; put into an environment, because keywords and primitives can be 1544 | ;;; rebound. (i.e. (lambda (lambda) (lambda lambda)) is a perfectly 1545 | ;;; valid function in Scheme that takes a function as an argument and 1546 | ;;; applies the argument to itself). 1547 | (define build-primitive 1548 | (lambda (as) 1549 | (let ([name (car as)] [argc (cdr as)]) 1550 | (cons name 1551 | (if (< argc 0) 1552 | (error who 1553 | "primitives with arbitrary counts are not currently supported ~a" 1554 | name) 1555 | ;;; we'd love to support arbitrary argument lists, but we'd 1556 | ;;; need to either: 1557 | ;;; 1. get rid of raw primitives, or 1558 | ;;; 2. add function versions of our raw primitives with 1559 | ;;; arbitrary arguments, or (possibly and) 1560 | ;;; 3. add general handling for functions with arbitrary 1561 | ;;; arguments. (i.e. support for (lambda args ) 1562 | ;;; or (lambda (x y . args) ), which we don't 1563 | ;;; currently support. 1564 | #;(let ([argc (bitwise-not argc)]) 1565 | (lambda (env . e*) 1566 | (if (>= (length e*) argc) 1567 | `(,name ,(Expr* e* env) ...) 1568 | (error name "invalid argument count ~a" 1569 | (cons name e*))))) 1570 | (lambda (env . e*) 1571 | (if (= (length e*) argc) 1572 | `(,name ,(Expr* e* env) ...) 1573 | (error name "invalid argument count ~a" 1574 | (cons name e*))))))))) 1575 | ;;; initial-env - this is our initial environment, expressed as an 1576 | ;;; association list of keywords and primitives (represented as 1577 | ;;; symbols) to procedure handlers (represented as procedures). As the 1578 | ;;; program is processed through this pass, it will be extended with 1579 | ;;; local bidings from variables (represented as symbols) to unique 1580 | ;;; variables (represented as symbols with a format of symbol.number). 1581 | (define initial-env 1582 | (list* 1583 | (cons 'quote (lambda (env d) 1584 | (unless (datum? d) 1585 | (error 'quote "invalid datum ~a" d)) 1586 | `(quote ,d))) 1587 | (cons 'if (case-lambda 1588 | [(env e0 e1) `(if ,(Expr e0 env) ,(Expr e1 env))] 1589 | [(env e0 e1 e2) 1590 | `(if ,(Expr e0 env) ,(Expr e1 env) ,(Expr e2 env))] 1591 | [x (error 'if (if (< (length x) 3) 1592 | "too few arguments ~a" 1593 | "too many arguments ~a") 1594 | x)])) 1595 | (cons 'or (lambda (env . e*) `(or ,(Expr* e* env) ...))) 1596 | (cons 'and (lambda (env . e*) `(and ,(Expr* e* env) ...))) 1597 | (cons 'not (lambda (env e) `(not ,(Expr e env)))) 1598 | (cons 'begin (lambda (env . e*) 1599 | (process-body 'begin env e* 1600 | (lambda (e* e) 1601 | `(begin ,e* ... ,e))))) 1602 | (cons 'lambda (lambda (env fmls . body*) 1603 | (unique-vars env fmls 1604 | (lambda (env fmls) 1605 | (process-body 'lambda env body* 1606 | (lambda (body* body) 1607 | `(lambda (,fmls ...) 1608 | ,body* ... ,body))))))) 1609 | (cons 'let (lambda (env bindings . body*) 1610 | (process-bindings #f env bindings 1611 | (lambda (env x* e*) 1612 | (process-body 'let env body* 1613 | (lambda (body* body) 1614 | `(let ([,x* ,e*] ...) ,body* ... ,body))))))) 1615 | (cons 'letrec (lambda (env bindings . body*) 1616 | (process-bindings #t env bindings 1617 | (lambda (env x* e*) 1618 | (process-body 'letrec env body* 1619 | (lambda (body* body) 1620 | `(letrec ([,x* ,e*] ...) 1621 | ,body* ... ,body))))))) 1622 | (cons 'set! (lambda (env x e) 1623 | (cond 1624 | [(assq x env) => 1625 | (lambda (as) 1626 | (let ([v (cdr as)]) 1627 | (if (symbol? v) 1628 | `(set! ,v ,(Expr e env)) 1629 | (error 'set! "invalid syntax ~a" 1630 | (list 'set! x e)))))] 1631 | [else (error 'set! "set to unbound variable ~a" 1632 | (list 'set! x e))]))) 1633 | (map build-primitive user-prims))) 1634 | ;;; App - helper for handling applications. 1635 | (define App 1636 | (lambda (e env) 1637 | (let ([e (car e)] [e* (cdr e)]) 1638 | `(,(Expr e env) ,(Expr* e* env) ...)))))) 1639 | ;;; transformer: Expr: S-expression -> LSrc:Expr (or error) 1640 | ;;; 1641 | ;;; parses an S-expression, looking for a pair (which indicates, a 1642 | ;;; keyword use, a primitive call, or a normal function call), a symbol 1643 | ;;; (which indicates a variable reference or a primitive reference), or one of 1644 | ;;; our constants (which indicates a raw constant). 1645 | (Expr : * (e env) -> Expr () 1646 | (cond 1647 | [(pair? e) 1648 | (cond 1649 | [(assq (car e) env) => 1650 | (lambda (as) 1651 | (let ([v (cdr as)]) 1652 | (if (procedure? v) 1653 | (apply v env (cdr e)) 1654 | (App e env))))] 1655 | [else (App e env)])] 1656 | [(symbol? e) 1657 | (cond 1658 | [(assq e env) => 1659 | (lambda (as) 1660 | (let ([v (cdr as)]) 1661 | (cond 1662 | [(symbol? v) v] 1663 | [(primitive? e) e] 1664 | [else (error who "invalid syntax ~a" e)])))] 1665 | [else (error who "unbound variable ~a" e)])] 1666 | [(constant? e) e] 1667 | [else (error who "invalid expression ~a" e)])) 1668 | ;;; kick off processing the S-expression by handing Expr our initial 1669 | ;;; S-expression and the initial environment. 1670 | (Expr e initial-env)) 1671 | 1672 | ;;; pass: remove-one-armed-if : Lsrc -> L1 1673 | ;;; 1674 | ;;; this pass replaces the (if e0 e1) form with an if that will explicitly 1675 | ;;; produce a void value when the predicate expression returns false. In 1676 | ;;; other words: 1677 | ;;; (if e0 e1) => (if e0 e1 (void)) 1678 | ;;; 1679 | ;;; Design descision: kept seperate from parse-and-rename to make it easier 1680 | ;;; to understand how the nanopass framework can be used. 1681 | ;;; 1682 | (define-pass remove-one-armed-if : Lsrc (e) -> L1 () 1683 | (Expr : Expr (e) -> Expr () 1684 | [(if ,[e0] ,[e1]) `(if ,e0 ,e1 (void))])) 1685 | 1686 | ;;; pass: remove-and-or-not : L1 -> L2 1687 | ;;; 1688 | ;;; this pass looks for references to and, or, and not and replaces it with 1689 | ;;; the appropriate if expressions. this pass follows the standard 1690 | ;;; expansions and has one small optimization: 1691 | ;;; 1692 | ;;; (if (not e0) e1 e2) => (if e0 e2 e1) [optimization] 1693 | ;;; (and) => #t [from Scheme standard] 1694 | ;;; (or) => #f [from Scheme standard] 1695 | ;;; (and e e* ...) => (if e (and e* ...) #f) [standard expansion] 1696 | ;;; (or e e* ...) => (let ([t e]) [standard expansion - 1697 | ;;; (if t t (or e* ...))) avoids computing e twice] 1698 | ;;; 1699 | ;;; Design decision: again kept separate from parse-and-rename to simplify 1700 | ;;; the discussion of this pass (adding it to parse-and-rename doesn't really 1701 | ;;; make parse-and-rename much more complicated, and if we had a macro 1702 | ;;; system, which would likely be implemented in parse-and-rename, or before 1703 | ;;; it, we would probably want and, or, and not defined as macros, rather 1704 | ;;; than forms in the language, in which case this pass would be 1705 | ;;; unnecessary). 1706 | ;;; 1707 | (define-pass remove-and-or-not : L1 (e) -> L2 () 1708 | (Expr : Expr (e) -> Expr () 1709 | [(if (not ,[e0]) ,[e1] ,[e2]) `(if ,e0 ,e2 ,e1)] 1710 | [(not ,[e0]) `(if ,e0 #f #t)] 1711 | [(and) #t] 1712 | [(and ,[e] ,[e*] ...) 1713 | ;; tiny inline loop (not tail recursive, so called f instead of loop) 1714 | (let f ([e e] [e* e*]) 1715 | (if (null? e*) 1716 | e 1717 | `(if ,e ,(f (car e*) (cdr e*)) #f)))] 1718 | [(or) #f] 1719 | [(or ,[e] ,[e*] ...) 1720 | ;; tiny inline loop (not tail recursive, so called f instead of loop) 1721 | (let f ([e e] [e* e*]) 1722 | (if (null? e*) 1723 | e 1724 | (let ([t (make-tmp)]) 1725 | `(let ([,t ,e]) (if ,t ,t ,(f (car e*) (cdr e*)))))))])) 1726 | 1727 | ;;; pass: make-being-explicit : L2 -> L3 1728 | ;;; 1729 | ;;; this pass takes the L2 let, letrec, and lambda expressions (which have 1730 | ;;; bodies that can contain more than one expression), and converts them into 1731 | ;;; bodies with a single expression, wrapped in a begin if necessary. To 1732 | ;;; avoid polluting the output with extra begins that contain only one 1733 | ;;; expression the build-begin helper checks to see if there is more then one 1734 | ;;; expression and if there is builds a begin. 1735 | ;;; 1736 | ;;; Effectively this does the following: 1737 | ;;; (let ([x* e*] ...) body0 body* ... body1) => 1738 | ;;; (let ([x* e*] ...) (begin body0 body* ... body1)) 1739 | ;;; (letrec ([x* e*] ...) body0 body* ... body1) => 1740 | ;;; (letrec ([x* e*] ...) (begin body0 body* ... body1)) 1741 | ;;; (lambda (x* ...) body0 body* ... body1) => 1742 | ;;; (lambda (x* ...) (begin body0 body* ... body1)) 1743 | ;;; 1744 | ;;; Design Decision: This could have been included with rename-and-parse, 1745 | ;;; without making it significantly more compilicated, but was separated out 1746 | ;;; to continue with simpler nanopass passes to help make it more obvious 1747 | ;;; what is going on here. 1748 | ;;; 1749 | (define-pass make-begin-explicit : L2 (e) -> L3 () 1750 | (Expr : Expr (e) -> Expr () 1751 | ;;; Note: the defitions are within the body of the Expr transformer 1752 | ;;; instead of being within the body of the pass. This means the 1753 | ;;; quasiquote is bound to the Expr form, and we don't need to use 1754 | ;;; with-output-language. 1755 | (definitions 1756 | ;;; build-begin - helper function to build a begin only when the body 1757 | ;;; contains more then one expression. (this version of the helper 1758 | ;;; is a little over-kill, but it makes our traces look a little 1759 | ;;; cleaner. there should be a simpler way of doing this.) 1760 | (define build-begin 1761 | (lambda (e* e) 1762 | (nanopass-case (L3 Expr) e 1763 | [(begin ,e1* ... ,e) 1764 | (build-begin (append e* e1*) e)] 1765 | [else 1766 | (if (null? e*) 1767 | e 1768 | (let loop ([e* e*] [re* '()]) 1769 | (if (null? e*) 1770 | `(begin ,(reverse re*) ... ,e) 1771 | (let ([e (car e*)]) 1772 | (nanopass-case (L3 Expr) e 1773 | [(begin ,e0* ... ,e0) 1774 | (loop (append e0* (cons e0 (cdr e*))) re*)] 1775 | [else (loop (cdr e*) (cons (car e*) re*))])))))])))) 1776 | [(let ([,x* ,[e*]] ...) ,[body*] ... ,[body]) 1777 | `(let ([,x* ,e*] ...) ,(build-begin body* body))] 1778 | [(letrec ([,x* ,[e*]] ...) ,[body*] ... ,[body]) 1779 | `(letrec ([,x* ,e*] ...) ,(build-begin body* body))] 1780 | [(lambda (,x* ...) ,[body*] ... ,[body]) 1781 | `(lambda (,x* ...) ,(build-begin body* body))])) 1782 | 1783 | ;;; pass : inverse-eta-raw-primitives : L3 -> L4 1784 | ;;; 1785 | ;;; Eta reduction recognizes a function that takes a set of arguments and 1786 | ;;; passes those arguments directly to another function, and unwraps the 1787 | ;;; function. For instance, the function: 1788 | ;;; (lambda (x y) (f x y)) 1789 | ;;; can be eta reduced to: 1790 | ;;; f 1791 | ;;; Eta reduction is not always a semantics preserving transformation because 1792 | ;;; it can change the termination properties of the program, for instance a 1793 | ;;; program that terminates, could turn into one that does not because a 1794 | ;;; function is applied directly, rather than a function that might never be 1795 | ;;; applied. 1796 | ;;; 1797 | ;;; In this pass, we are applying the inverse operation and adding a lambda 1798 | ;;; wrapper when we see a primitive. We do this so that primitives, which we 1799 | ;;; are going to open code into a C-code equivalent, can still be treated as 1800 | ;;; though it was a Scheme procedure. This allows us to map over primitives, 1801 | ;;; which would otherwise not be possible with our code generation. Our 1802 | ;;; transformation looks for primitives in call position, marking them as 1803 | ;;; primitive calls, and primitives not in call position are eta-expanded to move 1804 | ;;; them into call position. 1805 | ;;; 1806 | ;;; (pr e* ...) => (primcall pr e* ...) 1807 | ;;; pr => (lambda (x* ...) (primcall pr x* ...)) 1808 | ;;; 1809 | ;;; Design decision: Another way to handle this would be to create a single 1810 | ;;; function for each primitive, and lift these definitions to the top-level 1811 | ;;; of the program, including just those primitives that are used. This 1812 | ;;; would avoid the potential to re-creating the same procedure over and over 1813 | ;;; again, as we are now. 1814 | ;;; 1815 | (define-pass inverse-eta-raw-primitives : L3 (e) -> L4 () 1816 | (Expr : Expr (e) -> Expr () 1817 | [(,pr ,[e*] ...) `(primcall ,pr ,e* ...)] 1818 | [,pr (cond 1819 | [(assq pr void+user-prims) => 1820 | (lambda (as) 1821 | (do ((i (cdr as) (fx- i 1)) 1822 | (x* '() (cons (make-tmp) x*))) 1823 | ((fx= i 0) `(lambda (,x* ...) (primcall ,pr ,x* ...)))))] 1824 | [else (error who "unexpected primitive ~a" pr)])])) 1825 | 1826 | ;;; pass: quote-constants : L4 -> L5 1827 | ;;; 1828 | ;;; A simple pass to find raw constants and wrap them in a quote. 1829 | ;;; c => (quote c) 1830 | ;;; 1831 | ;;; Design decision: This could simply be included in the next pass. 1832 | ;;; 1833 | (define-pass quote-constants : L4 (e) -> L5 () 1834 | (Expr : Expr (e) -> Expr () 1835 | [,c `(quote ,c)])) 1836 | 1837 | ;;; pass: remove-complex-constants : L5 -> L6 1838 | ;;; 1839 | ;;; Lifts creation of constants composed of vectors or pairs outside the body 1840 | ;;; of the program and makes their creation explicit. In place of the 1841 | ;;; constants, a temporary variable reference is created. The total 1842 | ;;; transform looks something like the following: 1843 | ;;; 1844 | ;;; (letrec ([add-pair-parts (lambda (p) (+ (car p) (cdr p)))]) 1845 | ;;; (+ (add-pair-parts '(4 . 5)) (add-pair-parts '(6 .7)))) => 1846 | ;;; (let ([t0 (cons 4 5)] [t1 (cons 6 7)]) 1847 | ;;; (letrec ([add-pair-parts (lambda (p) (+ (car p) (cdr p)))]) 1848 | ;;; (+ (add-pair-parts t0) (add-pair-parts t1)))) 1849 | ;;; 1850 | ;;; Design decision: Another possibility is to simply convert the constants 1851 | ;;; into their memory-layed out variations, rather than treating it in pieces 1852 | ;;; like this. We could extend our C run-time support to know about these 1853 | ;;; pre-layed out items so that we do not need to construct them when the 1854 | ;;; program starts running. 1855 | ;;; 1856 | (define-pass remove-complex-constants : L5 (e) -> L6 () 1857 | (definitions 1858 | ;;; t* and e* used to gather up our final constant bindings (via set!) 1859 | (define t* '()) 1860 | (define e* '())) 1861 | (Expr : Expr (e) -> Expr () 1862 | (definitions 1863 | ;;; datum->expr - a helper function for recurring through the parts of 1864 | ;;; a vector or pair datum to construct its parts, until it reaches the 1865 | ;;; constants in the leaves of the datum. We put this definition 1866 | ;;; within the Expr transformer so that quasiquote will be bound to the 1867 | ;;; L6:Expr nonterminal creation code. 1868 | (define datum->expr 1869 | (lambda (x) 1870 | (cond 1871 | [(pair? x) ;; if we have a pair, cons its recurred parts. 1872 | `(primcall cons ,(datum->expr (car x)) ,(datum->expr (cdr x)))] 1873 | [(vector? x) ;; if we have a vector ... 1874 | (let ([l (vector-length x)] [t (make-tmp)]) 1875 | ;; 1. create a vector of the proper size 1876 | `(let ([,t (primcall make-vector (quote ,l))]) 1877 | (begin 1878 | ;; 2. set each elemenet in the vector with its recurred 1879 | ;; parts. 1880 | ,(let loop ([l l] [e* '()]) 1881 | (if (fx= l 0) 1882 | e* 1883 | (let ([l (fx- l 1)]) 1884 | (loop l 1885 | (cons 1886 | `(primcall vector-set! ,t 1887 | (quote ,l) 1888 | ,(datum->expr (vector-ref x l))) 1889 | e*))))) 1890 | ... 1891 | ;; and return the vector as the final expression 1892 | ,t)))] 1893 | ;; if it is a constant, simply quote it. 1894 | [(constant? x) `(quote ,x)])))) 1895 | [(quote ,d) ;; look for quoted constants 1896 | (if (constant? d) ;; if they are already simple 1897 | `(quote ,d) ;; quote them 1898 | (let ([t (make-tmp)]) ;; otherwise create a binding for them 1899 | (set! t* (cons t t*)) 1900 | (set! e* (cons (datum->expr d) e*)) 1901 | t))]) 1902 | ;; in the body, call the Expr transformer, and if t* is null (indicating we 1903 | ;; did not find any complex constants) don't bother creating the empty let 1904 | ;; around it. 1905 | (let ([e (Expr e)]) 1906 | (if (null? t*) 1907 | e 1908 | `(let ([,t* ,e*] ...) ,e)))) 1909 | 1910 | ;;; pass: identify-assigned-variables : L6 -> L7 1911 | ;;; 1912 | ;;; This pass identifies which variables are assigned using set!. This is the 1913 | ;;; first step in a process known as assignment conversion. We separate 1914 | ;;; assigned varaibles from unassigned variables, and assigned variables are 1915 | ;;; converted into reference cells that can be manipulated through 1916 | ;;; primitives. In this compiler, we use the existing box type to create the 1917 | ;;; cells (using the box primitive), reference the cells (using the unbox 1918 | ;;; primitive), and mutating the cells (using the set-box! primitive). One 1919 | ;;; of the reasons we perform assignment conversion is it allows multiple 1920 | ;;; closures to capture the same mutable variable and all of the closures 1921 | ;;; will see the same, up-to-date, value for that variable since they all 1922 | ;;; simply contain a pointer to the reference cell. If we didn't do this 1923 | ;;; conversion, we would need to figure out a different way to handle set! so 1924 | ;;; that the updates are propagated to all the closures that close over the 1925 | ;;; variable. The eventual effect of assignemnt conversion is the following: 1926 | ;;; (let ([x 5]) 1927 | ;;; (set! x (+ x 5)) 1928 | ;;; (+ x x)) => 1929 | ;;; (let ([t0 5]) 1930 | ;;; (let ([x (box t0)]) 1931 | ;;; (primcall set-box! x (+ (unbox x) 5) 1932 | ;;; (+ (unbox x) (unbox x)))) 1933 | ;;; (of course in this example, we could have simply shadowed x) 1934 | ;;; 1935 | ;;; This pass, however, is simply an analysis pass. It gathers up the set of 1936 | ;;; assigned variables and deposits them in an AssignedBody just inside their 1937 | ;;; binding points. The transformation in this pass is: 1938 | ;;; 1939 | ;;; (let ([x 5] [y 7] [z 10]) 1940 | ;;; (set! x (+ x y)) 1941 | ;;; (+ x z)) => 1942 | ;;; (let ([x 5] [y 7] [z 10]) 1943 | ;;; (assigned (x) 1944 | ;;; (set! x (+ x y)) 1945 | ;;; (+ x z))) 1946 | ;;; 1947 | ;;; The key operations we depend on are: 1948 | ;;; set-cons - to extend a set with a newly found assigned variable. 1949 | ;;; intersect - to determine which assigned variables are bound by a lambda, 1950 | ;;; let, or letrec. 1951 | ;;; difference - to remove assigned variables from a set once we locate their 1952 | ;;; binding form. 1953 | ;;; union - to gather assigned variables from sub-expressions into a 1954 | ;;; single set. 1955 | ;;; 1956 | ;;; Note: we are using a relatively inefficient representation of sets here, 1957 | ;;; simply representing them as lists and using our set-cons, intersect, 1958 | ;;; difference, and union procedures to maintain their set-ness. We could 1959 | ;;; choose a more efficient set representation, perhaps leveraging insertion 1960 | ;;; sort or something similar, or we could choose to represent our variables 1961 | ;;; using a mutable record, with a field to indicate if it is assigned. 1962 | ;;; Either approach will improve the worst case performance of this pass, 1963 | ;;; though the mutable record version will get us down to a linear cost, 1964 | ;;; which is the best case for any pass in the current version of the 1965 | ;;; nanopass framework. 1966 | ;;; 1967 | (define-pass identify-assigned-variables : L6 (e) -> L7 () 1968 | (Expr : Expr (e) -> Expr ('()) 1969 | ;; identify an assigned variable 1970 | [(set! ,x ,[e assigned*]) (values `(set! ,x ,e) (set-cons x assigned*))] 1971 | ;; deposit assigned variables at lambda, let, and letrec binding sites 1972 | [(lambda (,x* ...) ,[body assigned*]) 1973 | (values 1974 | `(lambda (,x* ...) (assigned (,(intersect x* assigned*) ...) ,body)) 1975 | (difference assigned* x*))] 1976 | [(let ([,x* ,[e* assigned**]] ...) ,[body assigned*]) 1977 | (values 1978 | `(let ([,x* ,e*] ...) 1979 | (assigned (,(intersect x* assigned*) ...) ,body)) 1980 | (apply union (difference assigned* x*) assigned**))] 1981 | [(letrec ([,x* ,[e* assigned**]] ...) ,[body assigned*]) 1982 | (let ([assigned* (apply union assigned* assigned**)]) 1983 | (values 1984 | `(letrec ([,x* ,e*] ...) 1985 | (assigned (,(intersect x* assigned*) ...) ,body)) 1986 | (difference assigned* x*)))] 1987 | ;; traverse forms with nested expressions to gather up the assignments 1988 | ;; from each sub-expression. this could be simplified if the nanopass 1989 | ;; framework had a way to automatically combine these. 1990 | [(primcall ,pr ,[e* assigned**] ...) 1991 | (values `(primcall ,pr ,e* ...) (apply union assigned**))] 1992 | [(if ,[e0 assigned0*] ,[e1 assigned1*] ,[e2 assigned2*]) 1993 | (values `(if ,e0 ,e1 ,e2) (union assigned0* assigned1* assigned2*))] 1994 | [(begin ,[e* assigned**] ... ,[e assigned*]) 1995 | (values `(begin ,e* ... ,e) (apply union assigned* assigned**))] 1996 | [(,[e assigned*] ,[e* assigned**] ...) 1997 | (values `(,e ,e* ...) (apply union assigned* assigned**))]) 1998 | ;; in the body, call 1999 | (let-values ([(e assigned*) (Expr e)]) 2000 | (unless (null? assigned*) 2001 | (error who "found one or more unbound variables ~a" assigned*)) 2002 | e)) 2003 | 2004 | ;;; pass: purify-letrec : L7 -> L8 2005 | ;;; 2006 | ;;; this pass looks for places where letrec is used to bind something other 2007 | ;;; than a lambda expression, or where a letrec bound variable is assigned 2008 | ;;; and moves these bindings into let bindings. when the pass is done all of 2009 | ;;; the letrecs in our program will be immutable and bind only lambda 2010 | ;;; expressions. For instance, the following example: 2011 | ;;; 2012 | ;;; (letrec ([f (lambda (g x) (g x))] 2013 | ;;; [a 5] 2014 | ;;; [b (+ 5 7)] 2015 | ;;; [g (lambda (h) (f h 5))] 2016 | ;;; [c (let ([x 10]) ((letrec ([zero? (lambda (n) (= n 0))] 2017 | ;;; [f (lambda (n) 2018 | ;;; (if (zero? n) 2019 | ;;; 1 2020 | ;;; (* n (f (- n 1)))))]) 2021 | ;;; f) 2022 | ;;; x))] 2023 | ;;; [m 10] 2024 | ;;; [z (lambda (x) x)]) 2025 | ;;; (set! z (lambda (x) (+ x x))) 2026 | ;;; (set! m (+ m m)) 2027 | ;;; (+ (+ (+ (f z a) (f z b)) (f z c)) (g z)))) 2028 | ;;; => 2029 | ;;; (let ([z (quote #f)] [m '#f] [c '#f]) 2030 | ;;; (let ([b (+ '5 '7)] [a '5]) 2031 | ;;; (letrec ([g (lambda (h) (f h '5))] 2032 | ;;; [f (lambda (g x) (g x))]) 2033 | ;;; (begin 2034 | ;;; (set! z (lambda (x) x)) 2035 | ;;; (set! m '10) 2036 | ;;; (set! c 2037 | ;;; (let ([x '10]) 2038 | ;;; ((letrec ([f (lambda (n) 2039 | ;;; (if (zero? n) 2040 | ;;; '1 2041 | ;;; (* n (f (- n '1)))))] 2042 | ;;; [zero? (lambda (n) (= n '0))]) 2043 | ;;; f) 2044 | ;;; x))) 2045 | ;;; (begin 2046 | ;;; (set! z (lambda (x) (+ x x))) 2047 | ;;; (set! m (+ m m)) 2048 | ;;; (+ (+ (+ (f z a) (f z b)) (f z c)) (g z))))))) 2049 | ;;; 2050 | ;;; The algorithm for doing this is fairly simple. We attempt to separate 2051 | ;;; the bindings into simple bindings, lambda bindings, and complex bindings. 2052 | ;;; Simple bindings bind a constant, a variable reference not bound in this 2053 | ;;; letrec, the call to an effect free primitive, a begin that contains only 2054 | ;;; simple expressions, or an if that contains only simple expressions to an 2055 | ;;; immutable variable. The simple? predicate is used for determining when an 2056 | ;;; expression is simple. A lambda expression is simply a lambda, and a 2057 | ;;; complex expression is any other expression. 2058 | ;;; 2059 | ;;; Design decision: There are many other approaches that we could use, 2060 | ;;; including those described in the "Fixing Letrec: A Faithful Yet Efficient 2061 | ;;; Implementation of Scheme’s Recursive Binding Construct" by Waddell, et. 2062 | ;;; al. and "Fixing Letrec (reloaded)" by Ghuloum and Dybvig. Earlier 2063 | ;;; versions of Chez Scheme used the earlier paper, which documented how to 2064 | ;;; properly handle R5RS letrecs, and newer versions use the latter paper 2065 | ;;; which described how to properly handle R6RS letrec and letrec*. 2066 | ;;; 2067 | (define-pass purify-letrec : L7 (e) -> L8 () 2068 | (definitions 2069 | ;; lambda? - use nanopass case to determine if an L8:Expr is a lambda 2070 | ;; expression. 2071 | (define lambda? 2072 | (lambda (e) 2073 | (nanopass-case (L8 Expr) e 2074 | [(lambda (,x* ...) ,abody) #t] 2075 | [else #f]))) 2076 | ;; simple? - use nanopass case to deteremin if an L8:Expr is a "simple", 2077 | ;; effect free expression. 2078 | (define simple? 2079 | (lambda (x bound* assigned*) 2080 | (let f ([x x]) 2081 | (nanopass-case (L8 Expr) x 2082 | [(quote ,c) #t] 2083 | [,x (not (or (memq x bound*) (memq x assigned*)))] 2084 | [(primcall ,pr ,e* ...) 2085 | (and (effect-free-prim? pr) (andmap f e*))] 2086 | [(begin ,e* ... ,e) (and (andmap f e*) (f e))] 2087 | [(if ,e0 ,e1 ,e2) (and (f e0) (f e1) (f e2))] 2088 | [else #f]))))) 2089 | (Expr : Expr (e) -> Expr () 2090 | (definitions 2091 | ;; build a let, when there are bindings, otherwise, just return the 2092 | ;; body. 2093 | (define build-let 2094 | (lambda (x* e* a* body) 2095 | (if (null? x*) 2096 | body 2097 | `(let ([,x* ,e*] ...) (assigned (,a* ...) ,body))))) 2098 | ;; build a letrec, when there are bindings, otherwise, just return the 2099 | ;; body 2100 | (define build-letrec 2101 | (lambda (x* e* body) 2102 | (if (null? x*) 2103 | body 2104 | `(letrec ([,x* ,e*] ...) ,body)))) 2105 | ;; build a begin when we have more then one expression, otherwise just 2106 | ;; return the one expression. 2107 | (define build-begin 2108 | (lambda (e* e) 2109 | (nanopass-case (L8 Expr) e 2110 | [(begin ,e1* ... ,e) 2111 | (build-begin (append e* e1*) e)] 2112 | [else 2113 | (if (null? e*) 2114 | e 2115 | (let loop ([e* e*] [re* '()]) 2116 | (if (null? e*) 2117 | `(begin ,(reverse re*) ... ,e) 2118 | (let ([e (car e*)]) 2119 | (nanopass-case (L8 Expr) e 2120 | [(begin ,e0* ... ,e0) 2121 | (loop (append e0* (cons e0 (cdr e*))) re*)] 2122 | [else (loop (cdr e*) (cons (car e*) re*))])))))])))) 2123 | [(letrec ([,x* ,[e*]] ...) (assigned (,a* ...) ,[body])) 2124 | ;; loop through our bindings, separating them into simple, lambda, and 2125 | ;; complex. 2126 | (let loop ([xb* x*] [e* e*] 2127 | [xs* '()] [es* '()] 2128 | [xl* '()] [el* '()] 2129 | [xc* '()] [ec* '()]) 2130 | (if (null? xb*) 2131 | ;; when we're done bind the complex bindings to #f, they are now 2132 | ;; all assigned, then bind the simple bindings, then create a 2133 | ;; letrec binding for the lambda expressions (eliminate the 2134 | ;; assigned body, since we know none of them are assigned), and 2135 | ;; finally use set! to set the values of our complex bindings. 2136 | (build-let xc* (make-list (length xc*) `(quote #f)) xc* 2137 | (build-let xs* es* '() 2138 | (build-letrec xl* el* 2139 | (build-begin 2140 | (map (lambda (xc ec) `(set! ,xc ,ec)) xc* ec*) 2141 | body)))) 2142 | (let ([x (car xb*)] [e (car e*)]) 2143 | (cond 2144 | [(and (not (memq x a*)) (lambda? e)) 2145 | (loop (cdr xb*) (cdr e*) xs* es* 2146 | (cons x xl*) (cons e el*) xc* ec*)] 2147 | [(and (not (memq x a*)) (simple? e x* a*)) 2148 | (loop (cdr xb*) (cdr e*) (cons x xs*) (cons e es*) 2149 | xl* el* xc* ec*)] 2150 | [else (loop (cdr xb*) (cdr e*) xs* es* xl* el* 2151 | (cons x xc*) (cons e ec*))]))))])) 2152 | 2153 | ;;; pass: optimize-direct-call : L8 -> L8 2154 | ;;; 2155 | ;;; one of our simplest optimizations, we convert a directly applied lambdas 2156 | ;;; into a let. this allows us to avoid the creation of a closure for the 2157 | ;;; let, and allows us instead to create a local binding within a function. 2158 | ;;; the transform is simple: 2159 | ;;; 2160 | ;;; ((lambda (x* ...) body) e* ...) => (let ([x* e*] ...) body) 2161 | ;;; where (length x*) == (length e*) 2162 | ;;; 2163 | (define-pass optimize-direct-call : L8 (e) -> L8 () 2164 | (Expr : Expr (e) -> Expr () 2165 | [((lambda (,x* ...) ,[abody]) ,[e* -> e*] ...) 2166 | ;(guard (fx=? (length x*) (length e*))) 2167 | (guard #t) 2168 | `(let ([,x* ,e*] ...) ,abody)])) 2169 | 2170 | ;;; pass: find-let-bound-lambdas : L8 -> L8 2171 | ;;; 2172 | ;;; this pass looks for places where let is used to bind a lambda expression 2173 | ;;; to an immutable variable and converts this binding into a letrec binding. 2174 | ;;; Part of the reason we can do this here, is because we have uniquely named 2175 | ;;; each of our variables and none of these variables can be referenced in 2176 | ;;; the right-hand side of the let bindings. If it was still possible for 2177 | ;;; variables to have same name, this would not be a legal transformation, 2178 | ;;; since it might cause a lambda that did not capture a variable bound in 2179 | ;;; this let to bind the variable in the resulting letrec. The 2180 | ;;; transformation looks like: 2181 | ;;; 2182 | ;;; (let ([x 5] [f (lambda (n) (+ n n))] [g (lambda (x) x)] [m 10]) 2183 | ;;; (assigned (g m) 2184 | ;;; body)) => 2185 | ;;; (let ([x 5] [g (lambda (x) x)] [m 10]) 2186 | ;;; (assigned (g m) 2187 | ;;; (letrec ([f (lambda (n) (+ n n))]) 2188 | ;;; body))) 2189 | ;;; 2190 | ;;; Design decisions: Handling of let can be incorporated into the handling 2191 | ;;; of letrec, either through one of the algorithms described in the design 2192 | ;;; decisions of the purify-letrec pass, or in the existing letrec pass. It 2193 | ;;; is kept separate here, largely to make the letrec pass more straight 2194 | ;;; forward to understand. 2195 | ;;; 2196 | (define-pass find-let-bound-lambdas : L8 (e) -> L8 () 2197 | (Expr : Expr (e) -> Expr () 2198 | (definitions 2199 | ;; build-let - constructs a let if any variables are bound by the let, 2200 | ;; or simply returns the body if there are no bindings. 2201 | (define build-let 2202 | (lambda (x* e* a* body) 2203 | (if (null? x*) 2204 | body 2205 | `(let ([,x* ,e*] ...) (assigned (,a* ...) ,body))))) 2206 | ;; build-letrec - constructs a letrec if any variables are bound by the 2207 | ;; letrec, or simple returns the body if there are no bindings. 2208 | (define build-letrec 2209 | (lambda (x* le* body) 2210 | (if (null? x*) 2211 | body 2212 | `(letrec ([,x* ,le*] ...) ,body))))) 2213 | [(let ([,x* ,[e*]] ...) (assigned (,a* ...) ,[body])) 2214 | ;; executes a similar algorithm to the purify-letrec pass, though it 2215 | ;; does not separate simple from complex bindings, since we currently 2216 | ;; handle both in the let. 2217 | (let loop ([xb* x*] [e* e*] [xl* '()] [el* '()] [xo* '()] [eo* '()]) 2218 | (if (null? xb*) 2219 | (build-let xo* eo* a* (build-letrec xl* el* body)) 2220 | (let ([x (car xb*)] [e (car e*)]) 2221 | (nanopass-case (L8 Expr) e 2222 | [(lambda (,x* ...) ,abody) 2223 | (guard (not (memq x e*))) 2224 | (loop (cdr xb*) (cdr e*) (cons x xl*) (cons e el*) xo* eo*)] 2225 | [else (loop (cdr xb*) (cdr e*) xl* el* 2226 | (cons x xo*) (cons e eo*))]))))])) 2227 | 2228 | ;;; pass: remove-anonymous-lambda : L8 -> L9 2229 | ;;; 2230 | ;;; since we are generating a C function for each Scheme lambda, we need to 2231 | ;;; have a name for each of these lambdas. In addition we need a name to use 2232 | ;;; as the code pointer label, so that we can lift the lambdas to the top 2233 | ;;; level of the program. The transformation is fairly simple. If we find a 2234 | ;;; lambda in expression position (i.e. not in the right-hand side of a 2235 | ;;; letrec binding) then we wrap a letrec around it that gives it a new name. 2236 | ;;; 2237 | ;;; (letrec ([l* (lambda (x** ...) body*)] ...) body) => (no change) 2238 | ;;; (letrec ([l* (lambda (x** ...) body*)] ...) body) 2239 | ;;; 2240 | ;;; (lambda (x* ...) body) => (letrec ([t0 (lambda (x* ...) body)]) t0) 2241 | ;;; 2242 | (define-pass remove-anonymous-lambda : L8 (e) -> L9 () 2243 | (Expr : Expr (e) -> Expr () 2244 | [(lambda (,x* ...) ,[abody]) 2245 | (let ([t (unique-var 'anon)]) 2246 | `(letrec ([,t (lambda (,x* ...) ,abody)]) ,t))])) 2247 | 2248 | ;;; pass: convert-assignments : L9 -> L10 2249 | ;;; 2250 | ;;; this pass completes the assignment conversion process that we started in 2251 | ;;; identify-assigned-variables. We use the assigned variable list through 2252 | ;;; our previous passes to make decisions about how bindings were separated. 2253 | ;;; Now, we are ready to change these explicitly to the box, unbox, and 2254 | ;;; set-box! primitive calls described in the identified-assigned-variable 2255 | ;;; pass. We also introduce new temporaries to contain the value that will 2256 | ;;; be put in the box. this is largely because we don't want our 2257 | ;;; representation of assigned variables to be observable from inside the 2258 | ;;; program, and if we were to introduce an operator like call/cc to our 2259 | ;;; implementation, then the order our variables were setup could potentially 2260 | ;;; be identified by seeing that the allocation and computation of the values 2261 | ;;; are intermixed. Instead, we want all the computation to happen, then the 2262 | ;;; allocation, and then the allocated locations are updated with the values. 2263 | ;;; 2264 | ;;; Our transform thus looks like the following: 2265 | ;;; 2266 | ;;; (let ([x0 e0] [x1 e1] ... [xa0 ea0] [xa1 xa0] ...) 2267 | ;;; (assigned (xa0 xa1 ...) 2268 | ;;; body)) 2269 | ;;; => 2270 | ;;; (let ([x0 e0] [x1 e1] ... [t0 ea0] [t1 ea1] ...) 2271 | ;;; (let ([xa0 (primcall box t0)] [xa1 (primcall box t1)] ...) 2272 | ;;; body^)) 2273 | ;;; 2274 | ;;; (lambda (x0 x1 ... xa0 xa1 ...) (assigned (xa0 xa1 ...) body)) 2275 | ;;; => 2276 | ;;; (lambda (x0 x1 ... t0 t1 ...) 2277 | ;;; (let ([xa0 (primcall box t0)] [xa1 (primcall box t1)] ...) 2278 | ;;; body^)) 2279 | ;;; 2280 | ;;; where 2281 | ;;; (set! xa0 e) => (primcall set-box! xa0 e^) 2282 | ;;; and 2283 | ;;; xa0 => (primcall unbox xa0) 2284 | ;;; in body^ and e^ 2285 | ;;; 2286 | ;;; We could choose another data structure, or even create a new data 2287 | ;;; structure to perform the conversion with, however, we've choosen the box 2288 | ;;; because it contains exactly one cell, and takes up just one word in 2289 | ;;; memory, where as our pair and vector take at least two words in memory. 2290 | ;;; This decision might be different if we had other constraints on how we 2291 | ;;; lay out memory. 2292 | ;;; 2293 | (define-pass convert-assignments : L9 (e) -> L10 () 2294 | (definitions 2295 | ;; lookup - looks for assigned variables in the environment and maps them 2296 | ;; to their temporaries. 2297 | (define lookup 2298 | (lambda (env) 2299 | (lambda (x) 2300 | (cond 2301 | [(assq x env) => cdr] 2302 | [else x])))) 2303 | ;; build-env - generates temporaries, extends the environment, and 2304 | ;; returns the final list of unassigned binding variables, the list of 2305 | ;; emporaries, and the updated environment, through the call to f 2306 | (define build-env 2307 | (lambda (x* a* env f) 2308 | (let ([t* (map (lambda (x) (make-tmp)) a*)]) 2309 | (let ([env (append (map cons a* t*) env)]) 2310 | (f (map (lookup env) x*) t* env))))) 2311 | (with-output-language (L10 Expr) 2312 | ;; make-boxes - build the calls to box to create the storage locations 2313 | ;; for our assigned variables. 2314 | (define make-boxes 2315 | (lambda (t*) 2316 | (map (lambda (t) `(primcall box ,t)) t*))) 2317 | ;; build-let - builds a let if there are any bindings, or returns the 2318 | ;; body if there are none. 2319 | (define build-let 2320 | (lambda (x* e* body) 2321 | (if (null? x*) 2322 | body 2323 | `(let ([,x* ,e*] ...) ,body)))))) 2324 | (Expr : Expr (e [env '()]) -> Expr () 2325 | [(let ([,x* ,[e*]] ...) (assigned (,a* ...) ,body)) 2326 | (build-env x* a* env 2327 | (lambda (x* t* env) 2328 | (build-let x* e* 2329 | (build-let a* (make-boxes t*) 2330 | (Expr body env)))))] 2331 | [,x (if (assq x env) `(primcall unbox ,x) x)] 2332 | [(set! ,x ,[e]) `(primcall set-box! ,x ,e)]) 2333 | (LambdaExpr : LambdaExpr (le env) -> LambdaExpr () 2334 | [(lambda (,x* ...) (assigned (,a* ...) ,body)) 2335 | (build-env x* a* env 2336 | (lambda (x* t* env) 2337 | `(lambda (,x* ...) 2338 | ,(build-let a* (make-boxes t*) (Expr body env)))))])) 2339 | 2340 | ;;; pass: uncover-free : L10 -> L11 2341 | ;;; 2342 | ;;; this pass performs the first step in closure conversion, determining the 2343 | ;;; set of free-variables for each lambda expression. this list of free 2344 | ;;; variables is an approximation of the values that need to be available to 2345 | ;;; a procedure as its captured environment when a procedure is executed. 2346 | ;;; there are numerous ways to shrink, or even eliminate this list, but in 2347 | ;;; this compiler we are currently skipping any of these steps, and simply 2348 | ;;; taking this set of free variables as the set we need to capture. (For 2349 | ;;; one possible closure optimization technique see "Optimizing Flat 2350 | ;;; Closures" by Keep et. al. or Chapter 5. of "A Nanopass Compiler for 2351 | ;;; Commercial Compiler Development" by Keep). This is an analysis pass, 2352 | ;;; so we are just gathering up the free variables. This will look somewhat 2353 | ;;; similar to the identify-assigned-variables, except we care about all 2354 | ;;; variable references, but only the free variables at lambdas. 2355 | ;;; 2356 | (define-pass uncover-free : L10 (e) -> L11 () 2357 | (Expr : Expr (e) -> Expr (free*) 2358 | ;; quoted constants have no variable references 2359 | [(quote ,c) (values `(quote ,c) '())] 2360 | ;; gather up variable references 2361 | [,x (values x (list x))] 2362 | ;; if we find a let or a letrec remove the bound variables from the list 2363 | ;; of references. 2364 | [(let ([,x* ,[e* free**]] ...) ,[e free*]) 2365 | (values `(let ([,x* ,e*] ...) ,e) 2366 | (apply union (difference free* x*) free**))] 2367 | [(letrec ([,x* ,[le* free**]] ...) ,[body free*]) 2368 | (values `(letrec ([,x* ,le*] ...) ,body) 2369 | (difference (apply union free* free**) x*))] 2370 | ;; in all the other cases, we simply want to gather up the 2371 | ;; variable references from each sub expression 2372 | [(if ,[e0 free0*] ,[e1 free1*] ,[e2 free2*]) 2373 | (values `(if ,e0 ,e1 ,e2) (union free0* free1* free2*))] 2374 | [(begin ,[e* free**] ... ,[e free*]) 2375 | (values `(begin ,e* ... ,e) (apply union free* free**))] 2376 | [(primcall ,pr ,[e* free**]...) 2377 | (values `(primcall ,pr ,e* ...) (apply union free**))] 2378 | [(,[e free*] ,[e* free**] ...) 2379 | (values `(,e ,e* ...) (apply union free* free**))]) 2380 | (LambdaExpr : LambdaExpr (le) -> LambdaExpr (free*) 2381 | ;; at the lambda expression, remove our bound variables, everything else 2382 | ;; is free. we continue to return the free variables until we find their 2383 | ;; binding forms. 2384 | [(lambda (,x* ...) ,[body free*]) 2385 | (let ([free* (difference free* x*)]) 2386 | (values `(lambda (,x* ...) (free (,free* ...) ,body)) free*))]) 2387 | ;; in the body, we kick off with the Expr call, and make sure that we have 2388 | ;; an empty free list when we reach the top, because we expect our programs 2389 | ;; to be self-contained with no free-references. 2390 | (let-values ([(e free*) (Expr e)]) 2391 | (unless (null? free*) (error who "found unbound variables ~a" free*)) 2392 | e)) 2393 | 2394 | ;;; pass: convert-closures : L11 -> L12 2395 | ;;; 2396 | ;;; this pass begins closure conversion, using the free variable lists 2397 | ;;; gathered in the previous pass to begin creating our closure data 2398 | ;;; structures. This pass splits letrec bindings into a 'closures' binding 2399 | ;;; form, which lists the bound variable, a label that will refer to the code 2400 | ;;; of the function (and will become the function name), and the list of free 2401 | ;;; variables that will be included in the final closure datastructure. The 2402 | ;;; second binding form is the labels form, which binds the label for a 2403 | ;;; procedure to the procedures code. We also add an explicit closure 2404 | ;;; pointer argument to each procedure. If we were compiling to assembly 2405 | ;;; code, we might avoid this and just specify a register to hold the closure 2406 | ;;; pointer. We can also eliminate the need for the closure pointer if we 2407 | ;;; use the correct optimizations. Finally, we add the explicit closure 2408 | ;;; argument to each procedure call site. 2409 | ;;; 2410 | ;;; These transformations look as follows: 2411 | ;;; 2412 | ;;; (letrec ([x* (lambda (x** ...) (free (f** ...) body*))] ...) body) => 2413 | ;;; (closures ([x* l* f** ...] ...) 2414 | ;;; (labels ([l* (lambda (cp* x** ...) (free (f** ...) body*))] ...) body)) 2415 | ;;; where l* is a list of labels for each lambda expression and cp* is a 2416 | ;;; list of variables representing an explicit closure argument 2417 | ;;; 2418 | ;;; (x e* ...) => (x x e* ...) ; a small optimization 2419 | ;;; (e e* ...) => (let ([t e]) (t t e* ...)) 2420 | ;;; 2421 | ;;; Design decision: We separate the steps of closure creation and explicit 2422 | ;;; allocation and setting of closure values, partially so that we can 2423 | ;;; implement closure optimization passes that can help reduce the number of 2424 | ;;; free variables, or even eliminate closures entirely, when we do not have 2425 | ;;; any free variables. 2426 | ;;; 2427 | (define-pass convert-closures : L11 (e) -> L12 () 2428 | (definitions 2429 | (define make-cp (lambda (x) (unique-var 'cp))) 2430 | (define make-label 2431 | (lambda (x) 2432 | (unique-var 2433 | (string->symbol 2434 | (string-append "l:" 2435 | (symbol->string (base-var x)))))))) 2436 | (Expr : Expr (e) -> Expr () 2437 | [(letrec ([,x* (lambda (,x** ...) (free (,f** ...) ,[body*]))] ...) 2438 | ,[body]) 2439 | (let ([l* (map make-label x*)] [cp* (map make-cp x*)]) 2440 | `(closures ([,x* ,l* ,f** ...] ...) 2441 | (labels ([,l* (lambda (,cp* ,x** ...) 2442 | (free (,f** ...) ,body*))] ...) 2443 | ,body)))] 2444 | [(,x ,[e*] ...) `(,x ,x ,e* ...)] 2445 | [(,[e] ,[e*] ...) 2446 | (let ([t (make-tmp)]) 2447 | `(let ([,t ,e]) (,t ,t ,e* ...)))])) 2448 | 2449 | ;;; pass: optimize-known-call : L12 -> L12 2450 | ;;; 2451 | ;;; a tiny "optimization" pass that recognizes when we know what procedure 2452 | ;;; is being called, and refers to the procedure directly, rather than 2453 | ;;; requiring that the procedure pointer be accessed through a dereference 2454 | ;;; of the closure pointer. This allows the procedure to be called as: 2455 | ;;; 2456 | ;;; func_name_10(...) 2457 | ;;; 2458 | ;;; instead of: 2459 | ;;; 2460 | ;;; ((ptr (*)(ptr, ...))*(func_closure_10 + closure-code-offset - closure-tag)(...) 2461 | ;;; 2462 | ;;; in addition to looking simpler, it also avoids indirect calls, which 2463 | ;;; means both that we can avoid an extra memory reference, and the C 2464 | ;;; compiler has a better opportunity to optimize the call, and the processor 2465 | ;;; can potentially handle the code faster (in addition avoiding the extra 2466 | ;;; memory reference). 2467 | ;;; 2468 | ;;; Design decision: Our approach to determining when a call is known is 2469 | ;;; pretty simple. When we pass a closure binding we add the binding of the 2470 | ;;; variable to the label to our environment, and if we encounter a call to 2471 | ;;; one of these variables, we replace it with a reference to the label. 2472 | ;;; This gives us good results, but it will not detect every known call that 2473 | ;;; we might be able to find if we used a more expensive analysis like 2474 | ;;; control-flow analysis. For our purposes, the linear-time optimization 2475 | ;;; is fast and simple, but if we want a more precise analysis, and we are 2476 | ;;; willing to pay the additional cost (slightly less than cubic for 0CFA or 2477 | ;;; exponential for 1CFA or higher), than we could perform a more precise 2478 | ;;; analysis here. 2479 | ;;; 2480 | (define-pass optimize-known-call : L12 (e) -> L12 () 2481 | (LabelsBody : LabelsBody (lbody env) -> LabelsBody ()) 2482 | (LambdaExpr : LambdaExpr (le env) -> LambdaExpr ()) 2483 | (FreeBody : FreeBody (fbody env) -> FreeBody ()) 2484 | (Expr : Expr (e [env '()]) -> Expr () 2485 | [(closures ([,x* ,l* ,f** ...] ...) ,lbody) 2486 | (let ([env (foldl 2487 | (lambda (x l env) (cons (cons x l) env)) 2488 | env x* l*)]) 2489 | (let ([lbody (LabelsBody lbody env)]) 2490 | `(closures ([,x* ,l* ,f** ...] ...) ,lbody)))] 2491 | [(,x ,[e*] ...) 2492 | (cond 2493 | [(assq x env) => (lambda (as) `((label ,(cdr as)) ,e* ...))] 2494 | [else `(,x ,e* ...)])])) 2495 | 2496 | ;;; pass: expose-closure-prims : L12 -> L13 2497 | ;;; 2498 | ;;; this pass finishes closure conversion by turning our closures form into a 2499 | ;;; let binding closure variables to explicit closure allocations (using the 2500 | ;;; added make-closure primitive) and explicit closure set!s to fill in the 2501 | ;;; code (with the closure-code-set! primitive) and free variable values of 2502 | ;;; the closure (with the closre-data-set! primitive). We do this as 2503 | ;;; separate creation and mutation steps, since we may have circular 2504 | ;;; datastructures, where we need to place the value of a closure allocated 2505 | ;;; in the let binding in a closure bound by the same let binding. We also 2506 | ;;; move the labels form into plae as an expression, discard the free 2507 | ;;; variable list form the body of our lambda expressions, and make explicit 2508 | ;;; references to the closure code slot (with the closure-code primitive) 2509 | ;;; where closures are called, and the closure data slots (with the 2510 | ;;; closure-ref primitive) where a free variable is referenced. 2511 | ;;; 2512 | ;;; The transform looks as follows: 2513 | ;;; (closures ([x* l* f** ...] ...) lbody) => 2514 | ;;; (let ([x* (primcall make-closure ---)] ...) 2515 | ;;; (begin 2516 | ;;; (primcall closure-code-set! x* l*) ... 2517 | ;;; (primcall closure-data-set! x* 0 (car f**)) 2518 | ;;; (primcall closure-data-set! x* 1 (cadr f**)) 2519 | ;;; ...)) 2520 | ;;; 2521 | ;;; (x e* ...) => ((closure-code x) e* ...) 2522 | ;;; x => (closure-ref cp idx) ; where x is a free variable, and 2523 | ;;; ; idx is the offset of the free 2524 | ;;; ; variable in the closure. 2525 | ;;; 2526 | ;;; 2527 | ;;; Design decision: We could also combine this with the lift-lambdas pass 2528 | ;;; and finish lifting (our now first-order) procedures to the top-level of 2529 | ;;; the program. It is reasonable to keep these separate, since their action 2530 | ;;; on the code is a little different, but they could also be combined 2531 | ;;; without much trouble. 2532 | ;;; 2533 | (define-pass expose-closure-prims : L12 (e) -> L13 () 2534 | (Expr : Expr (e [cp #f] [free* '()]) -> Expr () 2535 | (definitions 2536 | (define handle-closure-ref 2537 | (lambda (x cp free*) 2538 | (let loop ([free* free*] [i 0]) 2539 | (cond 2540 | [(null? free*) x] 2541 | [(eq? x (car free*)) `(primcall closure-ref ,cp (quote ,i))] 2542 | [else (loop (cdr free*) (fx+ i 1))])))) 2543 | (define build-closure-set* 2544 | (lambda (x* l* f** cp free*) 2545 | (foldl 2546 | (lambda (x l f* e*) 2547 | (let loop ([f* f*] [i 0] [e* e*]) 2548 | (if (null? f*) 2549 | (cons `(primcall closure-code-set! ,x (label ,l)) e*) 2550 | (loop (cdr f*) (fx+ i 1) 2551 | (cons `(primcall closure-data-set! ,x (quote ,i) 2552 | ,(handle-closure-ref (car f*) cp free*)) 2553 | e*))))) 2554 | '() 2555 | x* l* f**)))) 2556 | [(closures ([,x* ,l* ,f** ...] ...) 2557 | (labels ([,l2* ,[le*]] ...) ,[body])) 2558 | (let ([size* (map length f**)]) 2559 | `(let ([,x* (primcall make-closure (quote ,size*))] ...) 2560 | (labels ([,l2* ,le*] ...) 2561 | (begin 2562 | ,(build-closure-set* x* l* f** cp free*) ... 2563 | ,body))))] 2564 | [,x (handle-closure-ref x cp free*)] 2565 | [((label ,l) ,[e*] ...) `((label ,l) ,e* ...)] 2566 | [(,[e] ,[e*] ...) `((primcall closure-code ,e) ,e* ...)]) 2567 | (LabelsBody : LabelsBody (lbody) -> Expr ()) 2568 | (LambdaExpr : LambdaExpr (le) -> LambdaExpr () 2569 | [(lambda (,x ,x* ...) (free (,f* ...) ,[body x f* -> body])) 2570 | `(lambda (,x ,x* ...) ,body)])) 2571 | 2572 | ;;; pass: lift-lambdas : L13 -> L14 2573 | ;;; 2574 | ;;; lifts all of the labels and lambda expressions to a top-level labels 2575 | ;;; binding. when we generate C code, these will become top-level C 2576 | ;;; functions. 2577 | ;;; 2578 | ;;; Design decisions: This pass is written using mutation, largely to shorten 2579 | ;;; the code that would gather up the label and lambda expression lists. 2580 | ;;; Another approach would be to gather these up by returning extra values 2581 | ;;; from each expression that has the list of labels and lambda expressions. 2582 | ;;; This would be simpler if the nanopass framework supported a way to flow 2583 | ;;; extra values through the data, but it doesn't currently support this 2584 | ;;; (it's on my feature todo list :). 2585 | ;;; 2586 | (define-pass lift-lambdas : L13 (e) -> L14 () 2587 | (definitions 2588 | (define *l* '()) 2589 | (define *le* '())) 2590 | (Expr : Expr (e) -> Expr () 2591 | [(labels ([,l* ,[le*]] ...) ,[body]) 2592 | (set! *l* (append l* *l*)) 2593 | (set! *le* (append le* *le*)) 2594 | body]) 2595 | (let ([e (Expr e)] [l (unique-var 'l:program)]) 2596 | `(labels ([,l (lambda () ,e)] [,*l* ,*le*] ...) ,l))) 2597 | 2598 | ;;; pass: remove-complex-opera* : L14 -> L15 2599 | ;;; 2600 | ;;; this pass removes nested complex operators. strictly speaking, this is 2601 | ;;; not something that we need to do since C is our target, however if we 2602 | ;;; want to taret assembly or something like LLVM. If we target something 2603 | ;;; like JavaScript, however, we might want to eliminate this. 2604 | ;;; 2605 | ;;; one reason I like this pass, is that it is a very simple pass for 2606 | ;;; something that is relatively complicated because the nanopadd framework 2607 | ;;; is really able to do a lot of work for us. 2608 | ;;; 2609 | ;;; Design decision: If we decide to remove this pass, the C code generation 2610 | ;;; pass will have to be a bit smarter about how it generates code, because 2611 | ;;; we will then have complex arguments, however, any decent C compiler 2612 | ;;; should be able to keep up with the tricks we'd need to play. 2613 | ;;; 2614 | (define-pass remove-complex-opera* : L14 (e) -> L15 () 2615 | (definitions 2616 | (with-output-language (L15 Expr) 2617 | (define build-let 2618 | (lambda (x* e* body) 2619 | (if (null? x*) 2620 | body 2621 | `(let ([,x* ,e*] ...) ,body))))) 2622 | (define simplify* 2623 | (lambda (e* f) 2624 | (let loop ([e* e*] [t* '()] [te* '()] [re* '()]) 2625 | (if (null? e*) 2626 | (build-let t* te* (f (reverse re*))) 2627 | (let ([e (car e*)]) 2628 | (nanopass-case (L15 Expr) e 2629 | [,x (loop (cdr e*) t* te* (cons x re*))] 2630 | [(quote ,c) (loop (cdr e*) t* te* (cons e re*))] 2631 | [(label ,l) (loop (cdr e*) t* te* (cons e re*))] 2632 | [else (let ([t (make-tmp)]) 2633 | (loop (cdr e*) (cons t t*) 2634 | (cons e te*) (cons t re*)))]))))))) 2635 | (Expr : Expr (e) -> Expr () 2636 | [(primcall ,pr ,[e*] ...) 2637 | (simplify* e* 2638 | (lambda (e*) 2639 | `(primcall ,pr ,e* ...)))] 2640 | [(,[e] ,[e*] ...) 2641 | (simplify* (cons e e*) 2642 | (lambda (e*) 2643 | `(,(car e*) ,(cdr e*) ...)))])) 2644 | 2645 | ;;; pass: recognize-context : L15 -> L16 2646 | ;;; 2647 | ;;; This pass seperates the Expr into Value, Effect, and Predicate cases. 2648 | ;;; The basic idea is to recognize where we have primitive calls that are out 2649 | ;;; of place for the value that they produce, the effect they perform, or the 2650 | ;;; branching direction they cause us to select. This is partially necessary 2651 | ;;; because we are choosing our own represenation for values, which may not 2652 | ;;; be the same as C's representation, and because we require that each 2653 | ;;; procedure return a value. The basic idea is pretty simple, the body of a 2654 | ;;; procedure is in Value context, so this is the context we start in. When 2655 | ;;; we process an 'if' form, the test position is in predicate context. In 2656 | ;;; this context we need to produce a true or false value in C (i.e. 0 for 2657 | ;;; true, or a non-zero integer, usually 1, for true). If we are in Value 2658 | ;;; context and we encounter a 'begin' form, the expressions before the end 2659 | ;;; of the 'begin' form are in effect context. 2660 | ;;; 2661 | ;;; The rules are as follows: 2662 | ;;; In Value context: 2663 | ;;; (primcall effect-prim e* ...) => 2664 | ;;; (begin (primcall effect-prim e* ...) (primcall void)) 2665 | ;;; (primcall pred-prim e* ...) => 2666 | ;;; (if (primcall pred-prim e* ...) (quote #t) (quote #f)) 2667 | ;;; 2668 | ;;; In Effect context: 2669 | ;;; x => (nop) 2670 | ;;; (quote c) => (nop) 2671 | ;;; (label l) => (nop) 2672 | ;;; (primcall value-prim e* ...) => (nop) 2673 | ;;; (primcall effect-prim e* ...) => (nop) 2674 | ;;; 2675 | ;;; In Predicate context (remember in Scheme #f is the only false value): 2676 | ;;; x => (if (primcall = x #f) (false) (true)) 2677 | ;;; (quote #f) => (false) 2678 | ;;; (quote (not #f)) => (true) 2679 | ;;; (primcall value-prim e* ...) => 2680 | ;;; (if (let ([t (primcall value-prim e* ...)]) 2681 | ;;; (= t (quote #f))) 2682 | ;;; (false) 2683 | ;;; (true)) 2684 | ;;; (primcall effect-prim e* ...) => 2685 | ;;; (begin (primcall effect-prim e* ...) (true)) ; (void) is not #f! 2686 | ;;; (se se* ...) => 2687 | ;;; (if (let ([t (se se* ...)]) 2688 | ;;; (primcall = t (quote #f))) 2689 | ;;; (false) 2690 | ;;; (true)) 2691 | ;;; we also do a small optimization, if we see (true) or (false) in 2692 | ;;; the output of an 'if' test form, we choose either the consequent or 2693 | ;;; the alternative. 2694 | ;;; 2695 | ;;; Design decision: We could swap recognize-context and 2696 | ;;; remove-complex-expr*, which would allow us to avoid building the 'let' 2697 | ;;; form when a Value prim or procedure call appears in the Predicate 2698 | ;;; context. On the other hand, we would need to process three contexts of 2699 | ;;; Expr, and maintain the context separation. 2700 | ;;; 2701 | (define-pass recognize-context : L15 (e) -> L16 () 2702 | (Value : Expr (e) -> Value () 2703 | [(primcall ,pr ,[se*] ...) 2704 | (guard (value-primitive? pr)) 2705 | `(primcall ,pr ,se* ...)] 2706 | [(primcall ,pr ,[se*] ...) 2707 | (guard (predicate-primitive? pr)) 2708 | `(if (primcall ,pr ,se* ...) (quote #t) (quote #f))] 2709 | [(primcall ,pr ,[se*] ...) 2710 | (guard (effect-primitive? pr)) 2711 | `(begin (primcall ,pr ,se* ...) (primcall void))] 2712 | [(primcall ,pr ,se* ...) 2713 | (error who "unexpected primitive found ~a" pr)]) 2714 | (Effect : Expr (e) -> Effect () 2715 | [,se `(nop)] 2716 | [(primcall ,pr ,[se*] ...) 2717 | (guard (effect-primitive? pr)) 2718 | `(primcall ,pr ,se* ...)] 2719 | [(primcall ,pr ,[se*] ...) 2720 | (guard (or (value-primitive? pr) (predicate-primitive? pr))) 2721 | `(nop)] 2722 | [(primcall ,pr ,se* ...) 2723 | (error who "unexpected primitive found ~a" pr)]) 2724 | (Predicate : Expr (e) -> Predicate () 2725 | [(quote ,c) (if c `(true) `(false))] 2726 | [,x `(if (primcall eq? x (quote #f)) (false) (true))] 2727 | [(if ,[p0] ,[p1] ,[p2]) 2728 | (nanopass-case (L16 Predicate) p0 2729 | [(true) p1] 2730 | [(false) p2] 2731 | [else `(if ,p0 ,p1 ,p2)])] 2732 | [(,[se] ,[se*] ...) 2733 | (let ([t (make-tmp)]) 2734 | `(if (let ([,t (,se ,se* ...)]) 2735 | (primcall = ,t (quote #f))) 2736 | (false) 2737 | (true)))] 2738 | [(primcall ,pr ,[se*] ...) 2739 | (guard (predicate-primitive? pr)) 2740 | `(primcall ,pr ,se* ...)] 2741 | [(primcall ,pr ,[se*] ...) 2742 | (guard (effect-primitive? pr)) 2743 | `(begin (primcall ,pr ,se* ...) (true))] 2744 | [(primcall ,pr ,[se*] ...) 2745 | (guard (value-primitive? pr)) 2746 | (let ([t (make-tmp)]) 2747 | `(if (let ([,t (primcall ,pr ,se* ...)]) 2748 | (primcall eq? ,t (quote #f))) 2749 | (false) 2750 | (true)))] 2751 | [(primcall ,pr ,se* ...) 2752 | (error who "unexpected primitive found ~a" pr)])) 2753 | 2754 | ;;; pass: expose-allocation-primitives : L16 -> L17 2755 | ;;; 2756 | ;;; this pass replaces the primitives that allocate new Scheme data 2757 | ;;; structures with a generic alloc form that takes the number of bytes to 2758 | ;;; allocate and the tag to add. (We cheat a little on the number of bytes 2759 | ;;; by using the fact that our fixnum data type is going to be adjusted 2760 | ;;; appropriately from representing the number of words in the data structure 2761 | ;;; to the number of bytes in the data structure.) This will eliminate 2762 | ;;; primitive calls to make-vector, make-closure, box, and cons and replace 2763 | ;;; it with allocs and explicit sets. One thing to note is that in the case 2764 | ;;; of box and cons, we want to be sure that the arguments are evaluated 2765 | ;;; first, then the space is allocated, and finally the values are set in the 2766 | ;;; data structure. We do this because, while we can evaluate the arguments 2767 | ;;; in any order, however, we need to complete their evaluation before we 2768 | ;;; start executing the primitive. In our little compiler, we could get away 2769 | ;;; with cheating, but if we added a feature like call/cc our cheats would be 2770 | ;;; observable. 2771 | ;;; 2772 | (define-pass expose-allocation-primitives : L16 (e) -> L17 () 2773 | (Value : Value (v) -> Value () 2774 | [(primcall ,vpr ,[se]) 2775 | (case vpr 2776 | [(make-vector) 2777 | (nanopass-case (L17 SimpleExpr) se 2778 | [(quote ,c) 2779 | (target-fixnum? c) 2780 | (let ([t (make-tmp)]) 2781 | `(let ([,t (alloc ,vector-tag (quote ,(+ c 1)))]) 2782 | (begin 2783 | (primcall $vector-length-set! ,t (quote ,c)) 2784 | ,t)))] 2785 | [else (let ([t0 (make-tmp)] [t1 (make-tmp)] [t2 (make-tmp)]) 2786 | `(let ([,t0 ,se]) 2787 | (let ([,t1 (primcall + ,t0 (quote 1))]) 2788 | (let ([,t2 (alloc ,vector-tag ,t1)]) 2789 | (begin 2790 | (primcall $vector-length-set! ,t2 ,t0) 2791 | ,t2)))))])] 2792 | [(make-closure) 2793 | (nanopass-case (L17 SimpleExpr) se 2794 | [(quote ,c) 2795 | (guard (target-fixnum? c)) 2796 | `(alloc ,closure-tag (quote ,(+ c 1)))] 2797 | [else (error who 2798 | "expected constant argument for make-closure primcall ~a" 2799 | (unparse-L16 v))])] 2800 | [(box) 2801 | (let ([t0 (make-tmp)] [t1 (make-tmp)]) 2802 | `(let ([,t0 ,se]) 2803 | (let ([,t1 (alloc ,box-tag (quote 1))]) 2804 | (begin 2805 | (primcall set-box! ,t1 ,t0) 2806 | ,t1))))] 2807 | [else `(primcall ,vpr ,se)])] 2808 | [(primcall ,vpr ,[se0] ,[se1]) 2809 | (case vpr 2810 | [(cons) 2811 | (let ([t0 (make-tmp)] [t1 (make-tmp)] [t2 (make-tmp)]) 2812 | `(let ([,t0 ,se0] [,t1 ,se1]) 2813 | (let ([,t2 (alloc ,pair-tag (quote 2))]) 2814 | (begin 2815 | (primcall $set-car! ,t2 ,t0) 2816 | (primcall $set-cdr! ,t2 ,t1) 2817 | ,t2))))] 2818 | [else `(primcall ,vpr ,se0 ,se1)])])) 2819 | 2820 | ;;; pass: return-of-set! : L17 -> L18 2821 | ;;; 2822 | ;;; In this psss we remove the 'let' form and replace it with set!. While 2823 | ;;; this set! looks like the source-level set!, it really is not the same 2824 | ;;; thing, since each of our variables only ever receive one value over the 2825 | ;;; course of running the program. If we were compiling to assembly or LLVM, 2826 | ;;; these set!s would directly set the variable at its allocated position, 2827 | ;;; i.e. in a register or memory location. Here we leave the job of deciding 2828 | ;;; where to allocate each of our single-assignemnt variables. In this pass, 2829 | ;;; we also gather up all of the variables as locals, so that we can put our 2830 | ;;; variable declarations at the start of the C function. (This is not 2831 | ;;; required in a modern C compiler, but it does make our job easier, since 2832 | ;;; we don't have to worry about needing to create variables in C contexts 2833 | ;;; where it might not be allowed.) This latter job is what causes all of 2834 | ;;; the extra work, since there is not a good way to gather up the values 2835 | ;;; without returning from every form in each of our three contexts. 2836 | ;;; 2837 | ;;; Design decision: We could simplify this pass by putting it before the 2838 | ;;; recognize-context pass, but that would compilcate the recognize-context 2839 | ;;; pass. With all of these types of decisions, it is largely a balancing 2840 | ;;; act of managing the complexity of individual passes, to try to keep the 2841 | ;;; compiler as simple as possible. 2842 | ;;; 2843 | (define-pass return-of-set! : L17 (e) -> L18 () 2844 | (definitions 2845 | (with-output-language (L18 Effect) 2846 | (define build-set*! 2847 | (lambda (x* v* body build-begin) 2848 | (build-begin 2849 | (map (lambda (x v) `(set! ,x ,v)) x* v*) 2850 | body))))) 2851 | (SimpleExpr : SimpleExpr (se) -> SimpleExpr ('())) 2852 | (Value : Value (v) -> Value ('()) 2853 | (definitions 2854 | (define build-begin 2855 | (lambda (e* v) 2856 | (nanopass-case (L18 Value) v 2857 | [(begin ,e1* ... ,v) 2858 | (build-begin (append e* e1*) v)] 2859 | [else 2860 | (if (null? e*) 2861 | v 2862 | (let loop ([e* e*] [re* '()]) 2863 | (if (null? e*) 2864 | `(begin ,(reverse re*) ... ,v) 2865 | (let ([e (car e*)]) 2866 | (nanopass-case (L18 Effect) e 2867 | [(nop) (loop (cdr e*) re*)] 2868 | [(begin ,e0* ... ,e0) 2869 | (loop (append e0* (cons e0 (cdr e*))) re*)] 2870 | [else (loop (cdr e*) (cons (car e*) re*))])))))])))) 2871 | [(if ,[p0 var0*] ,[v1 var1*] ,[v2 var2*]) 2872 | (values `(if ,p0 ,v1 ,v2) (append var0* var1* var2*))] 2873 | [(begin ,[e* var**] ... ,[v var*]) 2874 | (values (build-begin e* v) (apply append var* var**))] 2875 | [(primcall ,vpr ,[se* var**] ...) 2876 | (values `(primcall ,vpr ,se* ...) (apply append var**))] 2877 | [(,[se var*] ,[se* var**] ...) 2878 | (values `(,se ,se* ...) (apply append var* var**))] 2879 | [(let ([,x* ,[v* var**]] ...) ,[body var*]) 2880 | (values 2881 | (build-set*! x* v* body build-begin) 2882 | (apply append x* var* var**))]) 2883 | (Effect : Effect (e) -> Effect ('()) 2884 | (definitions 2885 | (define build-begin 2886 | (lambda (e* e) 2887 | (nanopass-case (L18 Effect) e 2888 | [(begin ,e1* ... ,e) 2889 | (build-begin (append e* e1*) e)] 2890 | [else 2891 | (if (null? e*) 2892 | e 2893 | (let loop ([e* e*] [re* '()]) 2894 | (if (null? e*) 2895 | `(begin ,(reverse re*) ... ,e) 2896 | (let ([e (car e*)]) 2897 | (nanopass-case (L18 Effect) e 2898 | [(nop) (loop (cdr e*) re*)] 2899 | [(begin ,e0* ... ,e0) 2900 | (loop (append e0* (cons e0 (cdr e*))) re*)] 2901 | [else (loop (cdr e*) (cons (car e*) re*))])))))])))) 2902 | [(if ,[p0 var0*] ,[e1 var1*] ,[e2 var2*]) 2903 | (values `(if ,p0 ,e1 ,e2) (append var0* var1* var2*))] 2904 | [(begin ,[e* var**] ... ,[e var*]) 2905 | (values (build-begin e* e) (apply append var* var**))] 2906 | [(primcall ,epr ,[se* var**] ...) 2907 | (values `(primcall ,epr ,se* ...) (apply append var**))] 2908 | [(,[se var*] ,[se* var**] ...) 2909 | (values `(,se ,se* ...) (apply append var* var**))] 2910 | [(let ([,x* ,[v* var**]] ...) ,[e var*]) 2911 | (values 2912 | (build-set*! x* v* e build-begin) 2913 | (apply append x* var* var**))]) 2914 | (Predicate : Predicate (p) -> Predicate ('()) 2915 | (definitions 2916 | (define build-begin 2917 | (lambda (e* p) 2918 | (nanopass-case (L18 Predicate) p 2919 | [(begin ,e1* ... ,p) 2920 | (build-begin (append e* e1*) p)] 2921 | [else 2922 | (if (null? e*) 2923 | p 2924 | (let loop ([e* e*] [re* '()]) 2925 | (if (null? e*) 2926 | `(begin ,(reverse re*) ... ,p) 2927 | (let ([e (car e*)]) 2928 | (nanopass-case (L18 Effect) e 2929 | [(nop) (loop (cdr e*) re*)] 2930 | [(begin ,e0* ... ,e0) 2931 | (loop (append e0* (cons e0 (cdr e*))) re*)] 2932 | [else (loop (cdr e*) (cons (car e*) re*))])))))])))) 2933 | [(if ,[p0 var0*] ,[p1 var1*] ,[p2 var2*]) 2934 | (values `(if ,p0 ,p1 ,p2) (append var0* var1* var2*))] 2935 | [(begin ,[e* var**] ... ,[p var*]) 2936 | (values (build-begin e* p) (apply append var* var**))] 2937 | [(primcall ,ppr ,[se* var**] ...) 2938 | (values `(primcall ,ppr ,se* ...) (apply append var**))] 2939 | [(let ([,x* ,[v* var**]] ...) ,[p var*]) 2940 | (values 2941 | (build-set*! x* v* p build-begin) 2942 | (apply append x* var* var**))]) 2943 | (LambdaExpr : LambdaExpr (le) -> LambdaExpr () 2944 | [(lambda (,x* ...) ,[body var*]) 2945 | `(lambda (,x* ...) (locals (,var* ...) ,body))])) 2946 | 2947 | ;;; pass: flatten-set! : L18 -> L19 2948 | ;;; 2949 | ;;; In the previous pass we remove the 'let' form, but we now may have set! 2950 | ;;; expressions on the right-hand side of a set!, such as the following: 2951 | ;;; 2952 | ;;; (set! x.0 (begin 2953 | ;;; (set! y.1 5) 2954 | ;;; (set! z.2 7) 2955 | ;;; (primcall + y.1 z.2))) 2956 | ;;; 2957 | ;;; However, while this is legal in C, we'd like to avoid this, which will 2958 | ;;; help us generate a little easier to read code, and again if we were 2959 | ;;; targeting something like assembly, would be required. We can transform 2960 | ;;; our example above into: 2961 | ;;; 2962 | ;;; (begin 2963 | ;;; (set! y.1 5) 2964 | ;;; (set! z.2 7) 2965 | ;;; (set! x.0 (primcall + y.1 z.2))) 2966 | ;;; 2967 | (define-pass flatten-set! : L18 (e) -> L19 () 2968 | (SimpleExpr : SimpleExpr (se) -> SimpleExpr ()) 2969 | (Effect : Effect (e) -> Effect () 2970 | [(set! ,x ,v) (flatten v x)]) 2971 | (flatten : Value (v x) -> Effect () 2972 | [,se `(set! ,x ,(SimpleExpr se))] 2973 | [(primcall ,vpr ,[se*] ...) `(set! ,x (primcall ,vpr ,se* ...))] 2974 | [(alloc ,i ,[se]) `(set! ,x (alloc ,i ,se))] 2975 | [(,[se] ,[se*] ...) `(set! ,x (,se ,se* ...))])) 2976 | 2977 | ;;; pass: push-if : L19 -> L20 2978 | ;;; 2979 | ;;; It turns out I was a little overzealous with this pass and didn't quite 2980 | ;;; handle all of the cases. In particular, in my hustle, I did not think 2981 | ;;; about the `(if p0 p1 p2) where the result expressions contain effects... 2982 | ;;; i.e. (if (begin ,e0* ... ,p0) (begin ,e1* ... ,p1) (begin ,e2* ... 2983 | ;;; ,p2)) can only be handled if: 2984 | ;;; 1. we are willing to copy the code for the tail of our ifs (we aren't, 2985 | ;;; this can lead to exponential code explosion) or 2986 | ;;; 2. if we are willing to flatten this code and use labels and gotos in 2987 | ;;; our generated code. 2988 | ;;; Number 2 is a more reasonable solution, but lucky for us, C will allow us 2989 | ;;; to generate code like the following: 2990 | ;;; 2991 | ;;; (if (begin ,e0* ... ,p0) (begin ,e1* ... ,p1) (begin ,e2* ... ,p2)) => 2992 | ;;; 2993 | ;;; (((e0*[0]), (e0*[1]), ..., (e0*[n]), p0) ? 2994 | ;;; ((e1*[0]), (e1*[1]), ..., (e1*[n]), p1) : 2995 | ;;; ((e2*[0]), (e2*[1]), ..., (e2*[n]), p2)) 2996 | ;;; 2997 | ;;; I've left the pass here as an example that even when we think we've got a 2998 | ;;; pass written and working, it easy to miss things, which is why we test, 2999 | ;;; and why we need to think carefully as we work through the compiler. 3000 | ;;; 3001 | ; (define-pass push-if : L19 (e) -> L20 () 3002 | ; (Value : Value (v) -> Value () 3003 | ; (definitions 3004 | ; (define build-begin 3005 | ; (lambda (e* v) 3006 | ; (if (null? e*) v `(begin ,e* ... ,v))))) 3007 | ; [(if ,[p0 e*] ,[v1] ,[v2]) (build-begin e* `(if ,p0 ,v1 ,v2))]) 3008 | ; (Effect : Effect (e) -> Effect () 3009 | ; (definitions 3010 | ; (define build-begin 3011 | ; (lambda (e* e) 3012 | ; (if (null? e*) e `(begin ,e* ... ,e))))) 3013 | ; [(if ,[p0 e*] ,[e1] ,[e2]) (build-begin e* `(if ,p0 ,e1 ,e2))]) 3014 | ; (Predicate : Predicate (p) -> Predicate ('()) 3015 | ; [(begin ,[e*] ... ,[p more-e*]) (values p (append e* more-e*))] 3016 | ; [(if ,[p0 e0*] ,[p1 e1*] ,[p2 e2*]) 3017 | ; (values `(if ,p0 (begin ,e1* ... p1) (begin ,e2* ... ,p2)) e0*)])) 3018 | 3019 | ;;; pass: specify-constant-representation : L19 -> L21 3020 | ;;; 3021 | ;;; This pass replaces our quoted constants with the explicit ptr 3022 | ;;; representation we've decided to use. This effectively replaces each of our 3023 | ;;; constants with a 64-bit integer. The conversion is pretty simple: 3024 | ;;; 3025 | ;;; #f => false-rep 3026 | ;;; #t => true-rep 3027 | ;;; '() => null-rep 3028 | ;;; fixnum => fixnum << fixnum-shift (yielding 64-bit integer) 3029 | ;;; 3030 | (define-pass specify-constant-representation : L19 (e) -> L21 () 3031 | (SimpleExpr : SimpleExpr (se) -> SimpleExpr () 3032 | [(quote ,c) 3033 | (cond 3034 | [(eq? c #f) false-rep] 3035 | [(eq? c #t) true-rep] 3036 | [(null? c) null-rep] 3037 | [(target-fixnum? c) 3038 | (arithmetic-shift c fixnum-shift)])])) 3039 | 3040 | ;;; pass: expand-primitives : L21 -> L22 3041 | ;;; 3042 | ;;; this pass expands our Scheme primitives into something close to their 3043 | ;;; C-language equivalents. This changes our math primitives to do the 3044 | ;;; adjustments required by changing the representation of fixnums (it works 3045 | ;;; fine for + and -, but * and / require us to do some shifting in order to 3046 | ;;; have a fixnum as a result). We also translate all of our memory 3047 | ;;; referencing primitives to mrefs and memory setting primitives into 3048 | ;;; msets!. When we generate C code for these, we will do the pointer 3049 | ;;; arithmetic required and then dereference the calculated address. 3050 | ;;; Remember, that because of our tags, we need to do some pointer arithmetic 3051 | ;;; for any dereference we wish to perform. This pointer arithmetic, though, 3052 | ;;; can be handled in a single memory reference argument on an x86_64 (which 3053 | ;;; is our assumed target platform). 3054 | ;;; 3055 | ;;; Design decision: Right now each of our "instructions" is a separate form 3056 | ;;; in the language, however, if we were to extend our source language and 3057 | ;;; primitive set much farther, it is likely that we would want to revisit 3058 | ;;; this to choose a representation where a single form could represent 3059 | ;;; several of these instructions. This might also be desirable if we change 3060 | ;;; the representation to LLVM or asm.js. 3061 | ;;;; 3062 | (define-pass expand-primitives : L21 (e) -> L22 () 3063 | (Value : Value (v) -> Value () 3064 | (definitions 3065 | (define build-begin 3066 | (lambda (e* v) 3067 | (nanopass-case (L22 Value) v 3068 | [(begin ,e1* ... ,v) 3069 | (build-begin (append e* e1*) v)] 3070 | [else 3071 | (if (null? e*) 3072 | v 3073 | (let loop ([e* e*] [re* '()]) 3074 | (if (null? e*) 3075 | `(begin ,(reverse re*) ... ,v) 3076 | (let ([e (car e*)]) 3077 | (nanopass-case (L22 Effect) e 3078 | [(nop) (loop (cdr e*) re*)] 3079 | [(begin ,e0* ... ,e0) 3080 | (loop (append e0* (cons e0 (cdr e*))) re*)] 3081 | [else (loop (cdr e*) (cons (car e*) re*))])))))])))) 3082 | [(begin ,[e*] ... ,[v]) (build-begin e* v)]) 3083 | (Rhs : Rhs (rhs) -> Rhs () 3084 | [(primcall ,vpr) 3085 | (case vpr 3086 | [(void) void-rep] 3087 | [else (error who "unexpected value primitive ~a" vpr)])] 3088 | [(primcall ,vpr ,[se]) 3089 | (case vpr 3090 | [(car) `(mref ,se #f ,(- pair-tag))] 3091 | [(cdr) `(mref ,se #f ,(- word-size pair-tag))] 3092 | [(unbox) `(mref ,se #f ,(- box-tag))] 3093 | [(closure-code) `(mref ,se #f ,(- closure-tag))] 3094 | [(vector-length) `(mref ,se #f ,(- vector-tag))] 3095 | [else (error who "unexpected value primitive ~a" vpr)])] 3096 | [(primcall ,vpr ,[se0] ,[se1]) 3097 | (case vpr 3098 | [(closure-ref) `(mref ,se0 ,se1 ,(- word-size closure-tag))] 3099 | [(vector-ref) `(mref ,se0 ,se1 ,(- word-size vector-tag))] 3100 | [(+) `(add ,se0 ,se1)] 3101 | [(-) `(subtract ,se0 ,se1)] 3102 | ;; when we multiply or divide, we need to shift either one of the 3103 | ;; arguments or the result. we could also be a bit more clever here, 3104 | ;; if one of the arguments is a constant, we can perform the shift 3105 | ;; ahead of time (assuming the constant still fits within the 64-bit 3106 | ;; width 3107 | [(*) `(multiply ,se0 (shift-right ,se1 ,fixnum-shift))] 3108 | [(/) `(shift-left (divide ,se0 ,se1) ,fixnum-shift)] 3109 | [else (error who "unexpected value primitive ~a" vpr)])] 3110 | [(primcall ,vpr ,se* ...) 3111 | (error who "unexpected value primitive ~a" vpr)]) 3112 | (Effect : Effect (e) -> Effect () 3113 | (definitions 3114 | (define build-begin 3115 | (lambda (e* e) 3116 | (nanopass-case (L22 Effect) e 3117 | [(begin ,e1* ... ,e) 3118 | (build-begin (append e* e1*) e)] 3119 | [else 3120 | (if (null? e*) 3121 | e 3122 | (let loop ([e* e*] [re* '()]) 3123 | (if (null? e*) 3124 | `(begin ,(reverse re*) ... ,e) 3125 | (let ([e (car e*)]) 3126 | (nanopass-case (L22 Effect) e 3127 | [(nop) (loop (cdr e*) re*)] 3128 | [(begin ,e0* ... ,e0) 3129 | (loop (append e0* (cons e0 (cdr e*))) re*)] 3130 | [else (loop (cdr e*) (cons (car e*) re*))])))))])))) 3131 | [(begin ,[e*] ... ,[e]) (build-begin e* e)] 3132 | [(primcall ,epr ,[se0] ,[se1]) 3133 | (case epr 3134 | [(set-box!) `(mset! ,se0 #f ,(- box-tag) ,se1)] 3135 | [($set-car!) `(mset! ,se0 #f ,(- pair-tag) ,se1)] 3136 | [($set-cdr!) `(mset! ,se0 #f ,(- word-size pair-tag) ,se1)] 3137 | [($vector-length-set!) `(mset! ,se0 #f ,(- vector-tag) ,se1)] 3138 | [(closure-code-set!) `(mset! ,se0 #f ,(- closure-tag) ,se1)] 3139 | [else (error who "unexpected effect primitive ~a" epr)])] 3140 | [(primcall ,epr ,[se0] ,[se1] ,[se2]) 3141 | (case epr 3142 | [(vector-set!) `(mset! ,se0 ,se1 ,(- word-size vector-tag) ,se2)] 3143 | [(closure-data-set!) 3144 | `(mset! ,se0 ,se1 ,(- word-size closure-tag) ,se2)] 3145 | [else (error who "unexpected effect primitive ~a" epr)])] 3146 | [(primcall ,epr ,se* ...) 3147 | (error who "unexpected effect primitive ~a" epr)]) 3148 | (Predicate : Predicate (p) -> Predicate () 3149 | (definitions 3150 | (define build-begin 3151 | (lambda (e* p) 3152 | (nanopass-case (L22 Predicate) p 3153 | [(begin ,e1* ... ,p) 3154 | (build-begin (append e* e1*) p)] 3155 | [else 3156 | (if (null? e*) 3157 | p 3158 | (let loop ([e* e*] [re* '()]) 3159 | (if (null? e*) 3160 | `(begin ,(reverse re*) ... ,p) 3161 | (let ([e (car e*)]) 3162 | (nanopass-case (L22 Effect) e 3163 | [(nop) (loop (cdr e*) re*)] 3164 | [(begin ,e0* ... ,e0) 3165 | (loop (append e0* (cons e0 (cdr e*))) re*)] 3166 | [else (loop (cdr e*) (cons (car e*) re*))])))))])))) 3167 | [(begin ,[e*] ... ,[p]) (build-begin e* p)] 3168 | [(primcall ,ppr ,[se]) 3169 | (case ppr 3170 | [(pair?) `(= (logand ,se ,pair-mask) ,pair-tag)] 3171 | [(null?) `(= ,se ,null-rep)] 3172 | [(boolean?) `(= (logand ,se ,boolean-mask) ,boolean-tag)] 3173 | [(vector?) `(= (logand ,se ,vector-mask) ,vector-tag)] 3174 | [(box?) `(= (logand ,se ,box-mask) ,box-tag)] 3175 | [else (error who "unexpected predicate primitive ~a" ppr)])] 3176 | [(primcall ,ppr ,[se0] ,[se1]) 3177 | (case ppr 3178 | [(eq? =) `(= ,se0 ,se1)] 3179 | [(<) `(< ,se0 ,se1)] 3180 | [(<=) `(<= ,se0 ,se1)] 3181 | [(>) `(<= ,se1 ,se0)] 3182 | [(>=) `(< ,se1 ,se0)] 3183 | [else (error who "unexpected predicate primitive ~a" ppr)])] 3184 | [(primcall ,ppr ,se* ...) 3185 | (error who "unexpected predicate primitive ~a" ppr)])) 3186 | 3187 | ;;; pass: generate-C : L22 -> printed-output 3188 | ;;; 3189 | ;;; this pass takes a program in the L22 language and produces a printed C 3190 | ;;; program. using a string or file port, the results of this can be 3191 | ;;; captured in a string or sent to a file to be compiled. The code that it 3192 | ;;; produces can be a little difficult to read, particularly with all of the 3193 | ;;; casts to and from ptr values. 3194 | ;;; 3195 | ;;; TODO: this pass is fairly convoluted, and could use some refactoring. We 3196 | ;;; might also want to try to pretty-print the C code so that it prints 3197 | ;;; out a bit better. 3198 | ;;; 3199 | (define-pass generate-c : L22 (e) -> * () 3200 | (definitions 3201 | (define string-join 3202 | (lambda (str* jstr) 3203 | (cond 3204 | [(null? str*) ""] 3205 | [(null? (cdr str*)) (car str*)] 3206 | [else (string-append (car str*) jstr (string-join (cdr str*) jstr))]))) 3207 | ;;; symbol->c-id - converts any Scheme symbol into a valid C identifier. 3208 | (define symbol->c-id 3209 | (lambda (sym) 3210 | (let ([ls (string->list (symbol->string sym))]) 3211 | (if (null? ls) 3212 | "_" 3213 | (let ([fst (car ls)]) 3214 | (list->string 3215 | (cons 3216 | (if (char-alphabetic? fst) fst #\_) 3217 | (map (lambda (c) 3218 | (if (or (char-alphabetic? c) 3219 | (char-numeric? c)) 3220 | c 3221 | #\_)) 3222 | (cdr ls))))))))) 3223 | ;;; emit-function-header - generates a function header to be used in the 3224 | ;;; declaration of a function or the definition of a function. 3225 | (define format-function-header 3226 | (lambda (l x*) 3227 | (format "ptr ~a(~a)" l 3228 | (string-join 3229 | (map 3230 | (lambda (x) 3231 | (format "ptr ~a" (symbol->c-id x))) 3232 | x*) 3233 | ", ")))) 3234 | (define format-label-call 3235 | (lambda (l se*) 3236 | (format " ~a(~a)" (symbol->c-id l) 3237 | (string-join 3238 | (map (lambda (se) 3239 | (format "(ptr)~a" (format-simple-expr se))) 3240 | se*) 3241 | ", ")))) 3242 | (define format-general-call 3243 | (lambda (se se*) 3244 | (format "((ptr (*)(~a))~a)(~a)" 3245 | (string-join (make-list (length se*) "ptr") ", ") 3246 | (format-simple-expr se) 3247 | (string-join 3248 | (map (lambda (se) 3249 | (format "(ptr)~a" (format-simple-expr se))) 3250 | se*) 3251 | ", ")))) 3252 | (define format-binop 3253 | (lambda (op se0 se1) 3254 | (format "((long)~a ~a (long)~a)" 3255 | (format-simple-expr se0) 3256 | op 3257 | (format-simple-expr se1)))) 3258 | (define format-set! 3259 | (lambda (x rhs) 3260 | (format "~a = (ptr)~a" (symbol->c-id x) (format-rhs rhs))))) 3261 | ;; transformer to print our function declarations 3262 | (emit-function-decl : LambdaExpr (le l) -> * () 3263 | [(lambda (,x* ...) ,lbody) 3264 | (printf "~a;~%" (format-function-header l x*))]) 3265 | ;; transformer to print our function definitions 3266 | (emit-function-def : LambdaExpr (le l) -> * () 3267 | [(lambda (,x* ...) ,lbody) 3268 | (printf "~a {~%" (format-function-header l x*)) 3269 | (emit-function-body lbody) 3270 | (printf "}~%~%")]) 3271 | ;; transformer to emit the body of a function 3272 | (emit-function-body : LocalsBody (lbody) -> * () 3273 | [(locals (,x* ...) ,body) 3274 | (for-each (lambda (x) (printf " ptr ~a;~%" (symbol->c-id x))) x*) 3275 | (emit-value body x*)]) 3276 | ;; transformer to emit expressions in value context 3277 | (emit-value : Value (v locals*) -> * () 3278 | [(if ,p0 ,v1 ,v2) 3279 | (printf " if (~a) {~%" (format-predicate p0)) 3280 | (emit-value v1 locals*) 3281 | (printf " } else {~%") 3282 | (emit-value v2 locals*) 3283 | (printf " }~%")] 3284 | [(begin ,e* ... ,v) 3285 | (for-each emit-effect e*) 3286 | (emit-value v locals*)] 3287 | [,rhs (printf " return (ptr)~a;\n" (format-rhs rhs))]) 3288 | ;; transformer to format Predicate expressions into strings 3289 | (format-predicate : Predicate (p) -> * (str) 3290 | [(if ,p0 ,p1 ,p2) 3291 | (format "((~a) ? (~a) : (~a))" 3292 | (format-predicate p0) 3293 | (format-predicate p1) 3294 | (format-predicate p2))] 3295 | [(<= ,se0 ,se1) (format-binop "<=" se0 se1)] 3296 | [(< ,se0 ,se1) (format-binop "<" se0 se1)] 3297 | [(= ,se0 ,se1) (format-binop "==" se0 se1)] 3298 | [(true) "1"] 3299 | [(false) "0"] 3300 | [(begin ,e* ... ,p) 3301 | (string-join 3302 | (foldr (lambda (e s*) (cons (format-effect e) s*)) 3303 | (list (format-predicate p)) e*) 3304 | ", ")]) 3305 | ;; transformer to format effects in predicate context into strings 3306 | (format-effect : Effect (e) -> * (str) 3307 | [(if ,p0 ,e1 ,e2) 3308 | (format "((~a) ? (~a) : (~a))" 3309 | (format-predicate p0) 3310 | (format-effect e1) 3311 | (format-effect e2))] 3312 | [((label ,l) ,se* ...) (format-label-call l se*)] 3313 | [(,se ,se* ...) (format-general-call se se*)] 3314 | [(set! ,x ,rhs) (format-set! x rhs)] 3315 | [(nop) "0"] 3316 | [(begin ,e* ... ,e) 3317 | (string-join 3318 | (foldr (lambda (e s*) (cons (format-effect e) s*)) 3319 | (list (format-effect e)) e*) 3320 | ", ")] 3321 | [(mset! ,se0 ,se1? ,i ,se2) 3322 | (if se1? 3323 | (format "((*((ptr*)((long)~a + (long)~a + ~a))) = (ptr)~a)" 3324 | (format-simple-expr se0) (format-simple-expr se1?) 3325 | i (format-simple-expr se2)) 3326 | (format "((*((ptr*)((long)~a + ~a))) = (ptr)~a)" 3327 | (format-simple-expr se0) i (format-simple-expr se2)))]) 3328 | ;; formats simple expressions in to strings 3329 | (format-simple-expr : SimpleExpr (se) -> * (str) 3330 | [,x (symbol->c-id x)] 3331 | [,i (number->string i)] 3332 | [(label ,l) (format "(*~a)" (symbol->c-id l))] 3333 | [(logand ,se0 ,se1) (format-binop "&" se0 se1)] 3334 | [(shift-right ,se0 ,se1) (format-binop ">>" se0 se1)] 3335 | [(shift-left ,se0 ,se1) (format-binop "<<" se0 se1)] 3336 | [(divide ,se0 ,se1) (format-binop "/" se0 se1)] 3337 | [(multiply ,se0 ,se1) (format-binop "*" se0 se1)] 3338 | [(subtract ,se0 ,se1) (format-binop "-" se0 se1)] 3339 | [(add ,se0 ,se1) (format-binop "+" se0 se1)] 3340 | [(mref ,se0 ,se1? ,i) 3341 | (if se1? 3342 | (format "(*((ptr)((long)~a + (long)~a + ~a)))" 3343 | (format-simple-expr se0) 3344 | (format-simple-expr se1?) i) 3345 | (format "(*((ptr)((long)~a + ~a)))" (format-simple-expr se0) i))]) 3346 | ;; prints expressions in effect position into C statements 3347 | (emit-effect : Effect (e) -> * () 3348 | [(if ,p0 ,e1 ,e2) 3349 | (printf " if (~a) {~%" (format-predicate p0)) 3350 | (emit-effect e1) 3351 | (printf " } else {~%") 3352 | (emit-effect e2) 3353 | (printf " }~%")] 3354 | [((label ,l) ,se* ...) (printf " ~a;\n" (format-label-call l se*))] 3355 | [(,se ,se* ...) (printf " ~a;\n" (format-general-call se se*))] 3356 | [(set! ,x ,rhs) (printf " ~a;\n" (format-set! x rhs))] 3357 | [(nop) (void)] 3358 | [(begin ,e* ... ,e) 3359 | (for-each emit-effect e*) 3360 | (emit-effect e)] 3361 | [(mset! ,se0 ,se1? ,i ,se2) 3362 | (if se1? 3363 | (printf "(*((ptr*)((long)~a + (long)~a + ~a))) = (ptr)~a;\n" 3364 | (format-simple-expr se0) (format-simple-expr se1?) 3365 | i (format-simple-expr se2)) 3366 | (printf "(*((ptr*)((long)~a + ~a))) = (ptr)~a;\n" 3367 | (format-simple-expr se0) i (format-simple-expr se2)))]) 3368 | ;; formats the right-hand side of a set! into a C expression 3369 | (format-rhs : Rhs (rhs) -> * (str) 3370 | [((label ,l) ,se* ...) (format-label-call l se*)] 3371 | [(,se ,se* ...) (format-general-call se se*)] 3372 | [(alloc ,i ,se) 3373 | (if (use-boehm?) 3374 | (format "(ptr)((long)GC_MALLOC(~a) + ~al)" 3375 | (format-simple-expr se) i) 3376 | (format "(ptr)((long)malloc(~a) + ~al)" 3377 | (format-simple-expr se) i))] 3378 | [,se (format-simple-expr se)]) 3379 | ;; emits a C program for our progam expression 3380 | (Program : Program (p) -> * () 3381 | [(labels ([,l* ,le*] ...) ,l) 3382 | (let ([l (symbol->c-id l)] [l* (map symbol->c-id l*)]) 3383 | (define-syntax emit-include 3384 | (syntax-rules () 3385 | [(_ name) (printf "#include <~s>\n" 'name)])) 3386 | (define-syntax emit-predicate 3387 | (syntax-rules () 3388 | [(_ PRED_P mask tag) 3389 | (emit-c-macro PRED_P (x) "(((long)x & ~a) == ~a)" mask tag)])) 3390 | (define-syntax emit-eq-predicate 3391 | (syntax-rules () 3392 | [(_ PRED_P rep) 3393 | (emit-c-macro PRED_P (x) "((long)x == ~a)" rep)])) 3394 | (define-syntax emit-c-macro 3395 | (lambda (x) 3396 | (syntax-case x() 3397 | [(_ NAME (x* ...) fmt args ...) 3398 | #'(printf "#define ~s(~a) ~a\n" 'NAME 3399 | (string-join (map symbol->string '(x* ...)) ", ") 3400 | (format fmt args ...))]))) 3401 | ;; the following printfs output the tiny C runtime we are using 3402 | ;; to wrap the result of our compiled Scheme program. 3403 | (emit-include stdio.h) 3404 | (if (use-boehm?) 3405 | (emit-include gc.h) 3406 | (emit-include stdlib.h)) 3407 | (emit-predicate FIXNUM_P fixnum-mask fixnum-tag) 3408 | (emit-predicate PAIR_P pair-mask pair-tag) 3409 | (emit-predicate BOX_P box-mask box-tag) 3410 | (emit-predicate VECTOR_P vector-mask vector-tag) 3411 | (emit-predicate PROCEDURE_P closure-mask closure-tag) 3412 | (emit-eq-predicate TRUE_P true-rep) 3413 | (emit-eq-predicate FALSE_P false-rep) 3414 | (emit-eq-predicate NULL_P null-rep) 3415 | (emit-eq-predicate VOID_P void-rep) 3416 | (printf "typedef long* ptr;\n") 3417 | (emit-c-macro FIX (x) "((long)x << ~a)" fixnum-shift) 3418 | (emit-c-macro UNFIX (x) "((long)x >> ~a)" fixnum-shift) 3419 | (emit-c-macro UNBOX (x) "((ptr)*((ptr)((long)x - ~a)))" box-tag) 3420 | (emit-c-macro VECTOR_LENGTH_S (x) "((ptr)*((ptr)((long)x - ~a)))" vector-tag) 3421 | (emit-c-macro VECTOR_LENGTH_C (x) "UNFIX(VECTOR_LENGTH_S(x))") 3422 | (emit-c-macro VECTOR_REF (x i) "((ptr)*((ptr)((long)x - ~a + ((i+1) * ~a))))" vector-tag word-size) 3423 | (emit-c-macro CAR (x) "((ptr)*((ptr)((long)x - ~a)))" pair-tag) 3424 | (emit-c-macro CDR (x) "((ptr)*((ptr)((long)x - ~a + ~a)))" pair-tag word-size) 3425 | (printf "void print_scheme_value(ptr x) {\n") 3426 | (printf " long i, veclen;\n") 3427 | (printf " ptr p;\n") 3428 | (printf " if (TRUE_P(x)) {\n") 3429 | (printf " printf(\"#t\");\n") 3430 | (printf " } else if (FALSE_P(x)) {\n") 3431 | (printf " printf(\"#f\");\n") 3432 | (printf " } else if (NULL_P(x)) {\n") 3433 | (printf " printf(\"()\");\n") 3434 | (printf " } else if (VOID_P(x)) {\n") 3435 | (printf " printf(\"(void)\");\n") 3436 | (printf " } else if (FIXNUM_P(x)) {\n") 3437 | (printf " printf(\"%ld\", UNFIX(x));\n") 3438 | (printf " } else if (PAIR_P(x)) {\n") 3439 | (printf " printf(\"(\");\n") 3440 | (printf " for (p = x; PAIR_P(p); p = CDR(p)) {\n") 3441 | (printf " print_scheme_value(CAR(p));\n") 3442 | (printf " if (PAIR_P(CDR(p))) { printf(\" \"); }\n") 3443 | (printf " }\n") 3444 | (printf " if (NULL_P(p)) {\n") 3445 | (printf " printf(\")\");\n") 3446 | (printf " } else {\n") 3447 | (printf " printf(\" . \");\n") 3448 | (printf " print_scheme_value(p);\n") 3449 | (printf " printf(\")\");\n") 3450 | (printf " }\n") 3451 | (printf " } else if (BOX_P(x)) {\n") 3452 | (printf " printf(\"#(box \");\n") 3453 | (printf " print_scheme_value(UNBOX(x));\n") 3454 | (printf " printf(\")\");\n") 3455 | (printf " } else if (VECTOR_P(x)) {\n") 3456 | (printf " veclen = VECTOR_LENGTH_C(x);\n") 3457 | (printf " printf(\"#(\");\n") 3458 | (printf " for (i = 0; i < veclen; i += 1) {\n") 3459 | (printf " print_scheme_value(VECTOR_REF(x,i));\n") 3460 | (printf " if (i < veclen) { printf(\" \"); } \n") 3461 | (printf " }\n") 3462 | (printf " printf(\")\");\n") 3463 | (printf " } else if (PROCEDURE_P(x)) {\n") 3464 | (printf " printf(\"#(procedure)\");\n") 3465 | (printf " }\n") 3466 | (printf "}\n") 3467 | (map emit-function-decl le* l*) 3468 | (map emit-function-def le* l*) 3469 | (printf "int main(int argc, char * argv[]) {\n") 3470 | (printf " print_scheme_value(~a());\n" l) 3471 | (printf " printf(\"\\n\");\n") 3472 | (printf " return 0;\n") 3473 | (printf "}\n"))])) 3474 | 3475 | ;;; a little macro to make building a compiler with tracing that we can turn 3476 | ;;; off and on easier. no support for looping in this, but the syntax is very 3477 | ;;; simple: 3478 | ;;; (define-compiler my-compiler-name 3479 | ;;; (pass1 unparser) 3480 | ;;; (pass2 unparser) 3481 | ;;; ... 3482 | ;;; pass-to-generate-c) 3483 | ;;; 3484 | (define-syntax (define-compiler x) 3485 | (syntax-parse x 3486 | [(_ name (pass unparser) ... gen-c) 3487 | ;(with-implicit (#'name all-passes trace-passes) 3488 | #:with all-passes (format-id #'name "all-passes") 3489 | #:with trace-passes (format-id #'name "trace-passes") 3490 | #`(begin 3491 | (define all-passes '(pass ... gen-c)) 3492 | (define trace-passes 3493 | (let ([passes '()]) 3494 | (case-lambda 3495 | [() passes] 3496 | [(x) 3497 | (cond 3498 | [(symbol? x) 3499 | (unless (memq x all-passes) 3500 | (error 'trace-passes "invalid pass name ~a" x)) 3501 | (set! passes (list x))] 3502 | [(list? x) 3503 | (unless (andmap (lambda (x) (memq x all-passes)) x) 3504 | (error 'trace-passes 3505 | "one or more invalid pass names ~a" x)) 3506 | (set! passes x)] 3507 | [(eq? x #t) (set! passes all-passes)] 3508 | [(eq? x #f) (set! passes '())] 3509 | [else (error 'trace-passes 3510 | "invalid pass specifier ~a" x)])]))) 3511 | (define name 3512 | (lambda (x) 3513 | #,(let loop ([pass* #'(pass ...)] 3514 | [unparser* #'(unparser ...)]) 3515 | (if (null? pass*) 3516 | #'(begin 3517 | (when (file-exists? "t.c") (delete-file "t.c")) 3518 | (with-output-to-file "t.c" 3519 | (lambda () (gen-c x))) 3520 | (when (memq 'gen-c (trace-passes)) 3521 | (printf "output of pass ~s~%" 'gen-c) 3522 | (call-with-input-file "t.c" 3523 | (lambda (ip) 3524 | (let f () 3525 | (let ([s (read-string 512 ip)]) 3526 | (unless (eof-object? s) 3527 | (display s) 3528 | (f))))))) 3529 | (system 3530 | (format "gcc -m64 ~a t.c -o t" 3531 | (if (use-boehm?) "-lgc" ""))) 3532 | (when (file-exists? "t.out") 3533 | (delete-file "t.out")) 3534 | (system "./t > t.out") 3535 | (call-with-input-file "t.out" read)) 3536 | (with-syntax ([pass (stx-car pass*)] 3537 | [unparser (stx-car unparser*)]) 3538 | #`(let ([x (pass x)]) 3539 | (when (memq 'pass (trace-passes)) 3540 | (printf "output of pass ~s~%" 'pass) 3541 | (pretty-print (unparser x))) 3542 | #,(loop (stx-cdr pass*) 3543 | (stx-cdr unparser*)))))))))])) 3544 | 3545 | ;;; the definition of our compiler that pulls in all of our passes and runs 3546 | ;;; them in sequence checking to see if the programmer wants them traced. 3547 | (define-compiler my-tiny-compile 3548 | (parse-and-rename unparse-Lsrc) 3549 | (remove-one-armed-if unparse-L1) 3550 | (remove-and-or-not unparse-L2) 3551 | (make-begin-explicit unparse-L3) 3552 | (inverse-eta-raw-primitives unparse-L4) 3553 | (quote-constants unparse-L5) 3554 | (remove-complex-constants unparse-L6) 3555 | (identify-assigned-variables unparse-L7) 3556 | (purify-letrec unparse-L8) 3557 | (optimize-direct-call unparse-L8) 3558 | (find-let-bound-lambdas unparse-L8) 3559 | (remove-anonymous-lambda unparse-L9) 3560 | (convert-assignments unparse-L10) 3561 | (uncover-free unparse-L11) 3562 | (convert-closures unparse-L12) 3563 | (optimize-known-call unparse-L12) 3564 | (expose-closure-prims unparse-L13) 3565 | (lift-lambdas unparse-L14) 3566 | (remove-complex-opera* unparse-L15) 3567 | (recognize-context unparse-L16) 3568 | (expose-allocation-primitives unparse-L17) 3569 | (return-of-set! unparse-L18) 3570 | (flatten-set! unparse-L19) 3571 | ; (push-if unparse-L20) 3572 | (specify-constant-representation unparse-L21) 3573 | (expand-primitives unparse-L22) 3574 | generate-c) 3575 | --------------------------------------------------------------------------------