├── lfcs-2021-seminar-talk ├── Compilation as Multi-Language Semantics.key ├── Compilation as Multi-Language Semantics.pdf ├── anf-demo.rkt ├── jit-demo.rkt ├── aot-demo.rkt ├── abstract.txt └── snippets.tex ├── README.md ├── trace-anf-eg.rkt ├── prisc-extra.tex ├── lambda-mil.rkt ├── compose.rkt ├── Makefile ├── bib.rkt ├── paper.scrbl ├── al.rkt ├── anf-testing-confluence.rkt ├── lambda-h.rkt ├── base.rkt ├── code-gen.rkt ├── lambda-cc.rkt ├── hoist.rkt ├── cc2.rkt ├── notes.md ├── cc.rkt ├── lambda-a.rkt ├── lambda-b.rkt ├── source.scrbl ├── lambda-s.rkt ├── specify.rkt ├── anf.rkt ├── sketch.md ├── prisc2021-abstract.scrbl ├── defs.rkt └── a-normal-form.scrbl /lfcs-2021-seminar-talk/Compilation as Multi-Language Semantics.key: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wilbowma/multi-lang-comp/HEAD/lfcs-2021-seminar-talk/Compilation as Multi-Language Semantics.key -------------------------------------------------------------------------------- /lfcs-2021-seminar-talk/Compilation as Multi-Language Semantics.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wilbowma/multi-lang-comp/HEAD/lfcs-2021-seminar-talk/Compilation as Multi-Language Semantics.pdf -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Compilation as Multi-Language Semantics 2 | == 3 | This repo is my notebook on studying multi-language semantics and compilation. 4 | 5 | The most recent document is the extended abstract, [prisc2021-abstract.scrbl](). 6 | 7 | [sketch.md]() is an early draft explaining the idea at a high level. 8 | [notes.md]() are misc notes. 9 | -------------------------------------------------------------------------------- /trace-anf-eg.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | "anf.rkt" 5 | redex/reduction-semantics 6 | redex/gui) 7 | 8 | (traces 9 | anf-eval->+ 10 | (term (() 11 | (AS (letrec ([fact (λ (n) 12 | (if (eq? n 0) 13 | 1 14 | (* n (fact (- n 1)))))]) 15 | (fact 5)))))) 16 | -------------------------------------------------------------------------------- /lfcs-2021-seminar-talk/anf-demo.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | "../anf.rkt" 6 | redex/gui) 7 | 8 | 9 | ;; Take second path; there's a bug in the first one. 10 | ;(stepper anf->+ (term (AS (let ([x (let ([y true]) y)]) x)))) 11 | (stepper anf->+ (term (AS (let ([x (begin (set! y true) y)]) x)))) 12 | 13 | (stepper anf->+ (term (AS (+ (if (let ([x #t]) x) 6 7) 1)))) 14 | -------------------------------------------------------------------------------- /lfcs-2021-seminar-talk/jit-demo.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | "../anf.rkt" 6 | redex/gui) 7 | 8 | (stepper anf-eval->+ (term (() (AS (let ([x (let ([y true]) y)]) x))))) 9 | 10 | (stepper anf-eval->+ (term (() (AS (+ (if (let ([x #t]) x) 6 7) 1))))) 11 | 12 | (stepper anf-eval->+ 13 | (term 14 | (() (AS (letrec ([fact (λ (n) 15 | (if (eq? n 0) 16 | 1 17 | (* n (fact (- n 1)))))]) 18 | (fact 5)))))) 19 | -------------------------------------------------------------------------------- /prisc-extra.tex: -------------------------------------------------------------------------------- 1 | \usepackage{microtype} 2 | \usepackage{doi} 3 | \usepackage{letltxmacro} 4 | \renewcommand{\Sdoi}[1]{\doi{#1}} 5 | 6 | \LetLtxMacro{\oldfigure}{\Figure} 7 | \LetLtxMacro{\oldendfigure}{\endFigure} 8 | 9 | \renewenvironment{Figure} 10 | {\oldfigure} 11 | {\vspace{-2ex}\oldendfigure} 12 | 13 | \setcopyright{none} 14 | \settopmatter{printacmref=true, printccs=false} 15 | \acmDOI{} 16 | \acmISBN{} 17 | \acmPrice{} 18 | \acmBooktitle{Workshop on Principles of Secure Compilation (PriSC)} 19 | \acmConference[PriSC]{Workshop on Principles of Secure Compilation}{January 2021}{Online} 20 | -------------------------------------------------------------------------------- /lambda-mil.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | (require redex/reduction-semantics) 3 | 4 | ; λmilL is the λ-calculus language in monadic form. 5 | (define-language λmilL 6 | [v ::= fixnum x] 7 | [n ::= (box v) (unbox v) (set-box! v) (cons v v) (car v) (cdr v) 8 | (+ v v) (* v v)] 9 | [m ::= (λ (x ...) n) (v v ...) x (begin m ... m) 10 | (let ([x m] ...) m) (letrec ([x m] ...) m)] 11 | [x ::= variable-not-otherwise-mentioned] 12 | #:binding-forms 13 | (λ (x ...) e #:refers-to (shadow x ...)) 14 | (letrec ([x e] ...) #:refers-to (shadow x ...) 15 | e #:refers-to (shadow x ...)) 16 | (let ([x e] ...) e #:refers-to (shadow x ...))) 17 | -------------------------------------------------------------------------------- /lfcs-2021-seminar-talk/aot-demo.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | "../anf.rkt" 6 | redex/gui) 7 | 8 | 9 | ; Bug: not confluent, so doesn't normalize 10 | ;(stepper aot-normalize (term (let ([x (let ([y true]) y)]) x))) 11 | 12 | ;(stepper anf->+ (term (AS (let ([x (let ([y true]) y)]) x)))) 13 | 14 | (stepper aot-normalize (term (+ (if (let ([x #t]) x) 6 7) 1))) 15 | 16 | ;; Run to completion 17 | ;(stepper anf->+ (term (AS (let ([x (let ([y true]) y)]) x)))) 18 | (stepper anf->+ (term (AS (+ (if (let ([x #t]) x) 6 7) 1)))) 19 | 20 | ;; Bug, causes infinite loop. Must be missing a termination condition. 21 | #;(stepper aot-normalize 22 | (term 23 | (letrec ([fact (λ (n) 24 | (if (eq? n 0) 25 | 1 26 | (* n (fact (- n 1)))))]) 27 | (fact 5)))) 28 | -------------------------------------------------------------------------------- /compose.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | "lambda-s.rkt" 6 | "anf.rkt" 7 | "cc.rkt" 8 | "hoist.rkt" 9 | "specify.rkt" 10 | "code-gen.rkt" 11 | "al.rkt") 12 | 13 | ;(define-union-language CompileL ANFL CCL) 14 | ; 15 | ;(define anf-compile-> (extend-reduction-relation anf-> CompileL)) 16 | ;(define cc-compile-> (extend-reduction-relation cc-> CompileL)) 17 | ; 18 | ;(define compile-> (union-reduction-relations anf-compile-> cc-compile->)) 19 | ; 20 | ;(define anf-compile->+ (extend-reduction-relation anf->+ CompileL)) 21 | ;(define cc-compile->+ (extend-reduction-relation cc->+ CompileL)) 22 | ; 23 | ;(define compile->+ (union-reduction-relations anf-compile->+ cc-compile->+)) 24 | 25 | (current-cache-all? #t) 26 | (define (compile t) 27 | (for/fold ([t t]) 28 | ([pass (list anf->+ cc->+ h-> s->+ cg->+ #;flatten->+)]) 29 | (car (apply-reduction-relation* pass t)))) 30 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | all: /tmp/multi-lang-comp/latex/prisc2021-abstract.tex /tmp/multi-lang-comp/prisc2021-abstract.pdf /tmp/multi-lang-comp/sources.zip 2 | 3 | /tmp/multi-lang-comp/prisc2021-abstract.pdf: prisc2021-abstract.scrbl prisc-extra.tex wilbowma.bib 4 | scribble --dest /tmp/multi-lang-comp/ ++style prisc-extra.tex --pdf $< 5 | 6 | /tmp/multi-lang-comp/latex/prisc2021-abstract.tex: prisc2021-abstract.scrbl prisc-extra.tex wilbowma.bib Makefile 7 | scribble --dest /tmp/multi-lang-comp/latex/ ++style prisc-extra.tex --latex $< 8 | # fix up some scribble bugs 9 | sed -i "" -e s/_ACMon/_ACM_on/g -e 's/ ACMon/ ACM on/g' $@ 10 | sed -i "" -e s/_CPSTranslation/_CPS_Translation/g -e 's/ CPSTranslation/ CPS Translation/g' $@ 11 | 12 | /tmp/multi-lang-comp/sources.zip: /tmp/multi-lang-comp/latex/prisc2021-abstract.tex prisc2021-abstract.scrbl prisc-extra.tex wilbowma.bib 13 | rm -f $@ 14 | cd /tmp/multi-lang-comp/latex/ ; zip $@ ./* 15 | 16 | wilbowma.bib: ~/iCloud/wilbowma.bib 17 | cp $< $@ 18 | -------------------------------------------------------------------------------- /bib.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | scriblib/autobib 5 | scriblib/bibtex 6 | scribble/html-properties 7 | scribble/latex-properties 8 | setup/main-collects 9 | scribble/core) 10 | 11 | (provide 12 | (all-defined-out)) 13 | 14 | (define-cite acite acitet generate-bibliography) 15 | (define-bibtex-cite* "wilbowma.bib" acite acitet ~cite citet) 16 | 17 | #| 18 | ;; fixup issue with autobib when bibliography not generated 19 | (define autobib-style-extras 20 | (let ([abs (lambda (s) 21 | (path->main-collects-relative 22 | (collection-file-path s "scriblib")))]) 23 | (list 24 | (make-css-addition (abs "autobib.css")) 25 | (make-tex-addition (abs "autobib.tex"))))) 26 | 27 | (define-syntax-rule (~cite any ...) 28 | (make-element 29 | (make-style #f autobib-style-extras) 30 | (a~cite any ...))) 31 | 32 | (define-syntax-rule (citet any ...) 33 | (make-element 34 | (make-style #f autobib-style-extras) 35 | (acitet any ...))) 36 | |# 37 | -------------------------------------------------------------------------------- /paper.scrbl: -------------------------------------------------------------------------------- 1 | #lang scribble/acmart @acmsmall @nonacm @screen 2 | 3 | @title{Compilation as Multi-Language Semantics} 4 | @(require "bib.rkt") 5 | 6 | @author[ 7 | #:orcid "0000-0002-6402-4840" 8 | #:affiliation "University of British Columbia, CA" 9 | #:email (email "wjb@williamjbowman.com") 10 | ]{William J. Bowman} 11 | 12 | @abstract{ 13 | We present a compiler from a Scheme-like language to an x86-64-like assembly 14 | language. 15 | The compiler performs the A-normal-form translation, closure conversion, heap 16 | allocation and representation specification, hoisting, and code generation. 17 | 18 | The novelity in the design is that the compiler is not a translation between 19 | languages, but a reduction relation in a multi-language semantics. 20 | Formalizing the compiler as a multi-language semantics provides interesting 21 | semantic insights, presentation benefits, and verification benefit. 22 | Normalization of the cross-language redexes performs ahead-of-time (AOT) 23 | compilation. 24 | Evaluation in the multi-language models just-in-time (JIT) compilation. 25 | Confluence of multi-language reduction implies compiler correctness. 26 | The reduction systems compose easily, enabling simple veritcal composition of 27 | separate passes. 28 | Horizontal composition (linking) is enabled easily be embedding in the 29 | multi-language. 30 | } 31 | 32 | @include-section{source.scrbl} 33 | @include-section{a-normal-form.scrbl} 34 | @(generate-bibliography) 35 | -------------------------------------------------------------------------------- /al.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | (require 3 | redex/reduction-semantics) 4 | 5 | (provide (all-defined-out)) 6 | 7 | ; aL is an assembly language. 8 | ; It has infinite registers, an abstract notion of tagged word-size data and 9 | ; pointers, a primitive memory operations, push and pop onto an abstract stack. 10 | (define-language aL 11 | [n ::= w (alloc w tag) (+ r w) (* r w) (- r w) (to-word w) (mref r w)] 12 | [e ::= 13 | (begin e ...) 14 | (with-label label e) 15 | s] 16 | [s ::= 17 | (with-label label s) 18 | (set! r n) 19 | (push w) 20 | (pop r) 21 | (mset! r w w) 22 | (jump-if (flag r w) label) 23 | (jump-if (tag-eq? r tag) label) 24 | (jump w)] 25 | [w ::= (immediate number tag) r (word number) label] 26 | [r ::= (variable-prefix r)] 27 | [label ::= variable-not-otherwise-mentioned] 28 | [flag ::= eq? <] 29 | [tag ::= 'bool 'pair 'box 'void 'empty 'fixnum 'procedure] 30 | 31 | ; Spec 32 | #;[C ::= (compatible-closure-context e)] 33 | ; Speed 34 | [C ::= hole (begin s ... hole e ...)]) 35 | 36 | (current-cache-all? #t) 37 | 38 | (define flatten-> 39 | (reduction-relation 40 | aL 41 | 42 | #:domain e 43 | #:codomain e 44 | 45 | (--> 46 | (begin e ... (begin e_n ...) e_r ...) 47 | (begin e ... e_n ... e_r ...)) 48 | 49 | (--> 50 | (with-label label (begin e e_r ...)) 51 | (begin (with-label label e) e_r ...)))) 52 | 53 | (define flatten->+ (context-closure flatten-> aL C)) 54 | -------------------------------------------------------------------------------- /anf-testing-confluence.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | "anf.rkt" 6 | redex/gui) 7 | 8 | ;; Confluence fixed! 9 | ;(stepper anf->+ (term (AS (let ([x (let ([y true]) y)]) x)))) 10 | 11 | ;(stepper anf->+ (term (AS (let ([x true]) x)))) 12 | ;(stepper anf->+ (term (AS (let ([x (+ 1 2)]) x)))) 13 | 14 | ;(stepper anf->+ (term (AS (let ([x (SA (AS (let ([y true]) y)))]) x)))) 15 | 16 | ;(stepper anf->+ (term (AS (let ([x (begin (set! y true) y)]) x)))) 17 | 18 | ;\not-> 19 | 20 | ;; invalid, since merge-l only applies to a term without an empty evaluation 21 | ;; context; we're forced to focus on the begin in evaluation position, first. 22 | (stepper anf->+ (term (AS (SA (let ([x (AS (begin (set! y true) y))]) x))))) 23 | 24 | ;(stepper anf->+ (term (AS (+ (if (let ([x #t]) x) 6 7) 1)))) 25 | 26 | ;; Is this a valid T[p]? No; translation context cannot have any boundaries, 27 | ;; except the AS boundary. Definition of C (A.Cv) doesn't have any boundary 28 | ;; terms with a context or hole under them. 29 | 30 | ; (SA (let ([x (AS (SA (AS e1)))]) 31 | ; (AS e2))) 32 | 33 | #| 34 | (in-hole (SA (let ([x e]) (AS hole))) 35 | e2) 36 | 37 | C = (SA (let ([x e]) hole)) ; <- invalid C 38 | T = (AS hole) 39 | p = e2 40 | 41 | (in-hole (SA (let ([x e]) (AS hole))) 42 | e2) 43 | 44 | (let ([x (AS (SA (AS e1)))]) 45 | (AS e2)) 46 | 47 | C = (let ([x e]) hole) ; <- valid C 48 | T = (AS hole) 49 | p = e2 50 | 51 | 52 | C = (let ([x hole]) (AS hole)) ; <- valid C 53 | T = (AS hole) 54 | p = (SA (AS e1)) 55 | 56 | |# 57 | -------------------------------------------------------------------------------- /lambda-h.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | (require 3 | redex/reduction-semantics) 4 | 5 | (provide (all-defined-out)) 6 | 7 | ; λhL is the hoisted λ-calculus. 8 | (define-language λhL 9 | [p ::= (with-labels ([x (λ (x ...) e)] ...) 10 | p) 11 | e] 12 | [e ::= (cletrec ([x (closure x e ...)] 13 | ...) 14 | e) 15 | (apply-closure e e ...) 16 | ; should have more primitive apply to enable closure optimizations. 17 | x 18 | (begin e ... e) (box e) (unbox e) 19 | (set-box! e e) (cons e e) (car e) (cdr e) fixnum (+ e e) (- e e) (* e e) 20 | (let ([x e] ...) e) 21 | (void) '() 22 | (if e e e) 23 | (eq? e e) 24 | (pair? e) 25 | (fixnum? e) 26 | (boolean? e) 27 | (procedure? e) 28 | (box? e) 29 | (void? e) 30 | (< e e) 31 | (closure-ref e e) 32 | boolean] 33 | [x ::= variable-not-otherwise-mentioned] 34 | [fixnum ::= number] ; TODO restrict to fixnum range 35 | #:binding-forms 36 | (λ (x ...) e #:refers-to (shadow x ...)) 37 | 38 | (with-labels ([x any_1] ...) #:refers-to (shadow x ...) 39 | e_2 #:refers-to (shadow x ...)) 40 | 41 | (cletrec ([x any_1] ...) #:refers-to (shadow x ...) 42 | e_2 #:refers-to (shadow x ...)) 43 | 44 | (let ([x e_1] ...) e_2 #:refers-to (shadow x ...))) 45 | 46 | (define-term h-eg 47 | (with-labels 48 | ((factL 49 | (λ (c n) 50 | (let ((fact (closure-ref c 0))) 51 | (if (eq? n 0) 1 (* n (apply-closure fact fact (- n 1))))))) 52 | (even?L 53 | (λ (c n) 54 | (let ((odd? (closure-ref c 0))) 55 | (if (eq? n 0) #t (apply-closure odd? odd? (- n 1)))))) 56 | (odd?L 57 | (λ (c n) 58 | (let ((even? (closure-ref c 0))) 59 | (if (eq? n 0) #f (apply-closure even? even? (- n 1)))))) 60 | (lengthL 61 | (λ (c x) 62 | (let ((length (closure-ref c 0))) 63 | (cletrec 64 | ((empty? (closure empty?L))) 65 | (if (pair? x) 66 | (if (apply-closure empty? empty? x) 67 | 0 68 | (+ 1 (apply-closure length length (cdr x)))) 69 | -1))))) 70 | (empty?L (λ (c x) (let () (eq? x '()))))) 71 | (let ((x (box 0))) 72 | (cletrec 73 | ((fact (closure factL fact))) 74 | (cletrec 75 | ((even? (closure even?L odd?)) (odd? (closure odd?L even?))) 76 | (begin 77 | (if (apply-closure even? even? (apply-closure fact fact 5)) 78 | (cletrec 79 | ((length (closure lengthL length))) 80 | (set-box! x (apply-closure length length (cons 1 '())))) 81 | (set-box! x (cons 2 '()))) 82 | (unbox x))))))) 83 | -------------------------------------------------------------------------------- /lfcs-2021-seminar-talk/abstract.txt: -------------------------------------------------------------------------------- 1 | Title: Compilation as Multi-Language Semantics 2 | 3 | Abstract: 4 | 5 | Modeling interoperability between programs in different languages is a key 6 | problem when modeling verified and secure compilation, which has been 7 | successfully addressed using multi-language semantics. 8 | Unfortunately, existing models of compilation using multi-language semantics 9 | define two variants of each compiler pass: a syntactic translation 10 | on open terms to model compilation, and a run-time translation of closed terms 11 | at multi-language boundaries to model interoperability. 12 | 13 | In this talk, I discuss work-in-progress approach to uniformly model a compiler 14 | entirely as a reduction system on open term in a multi-language semantics, 15 | rather than as a syntactic translation. 16 | This simultaneously defines the compiler and the interoperability semantics, 17 | reducing duplication. 18 | It also provides interesting semantic insights. 19 | Normalization of the cross-language redexes performs ahead-of-time (AOT) 20 | compilation. 21 | Evaluation in the multi-language models just-in-time (JIT) compilation. 22 | Confluence of multi-language reduction implies compiler correctness, and 23 | part of the secure compilation proof (full abstraction), enabling focus on the difficult part of the proof. 24 | Subject reduction of the multi-language reduction implies type-preservation of 25 | the compiler. 26 | 27 | Bio: 28 | 29 | William J. Bowman is an Assistant Professor of computer science in the Software 30 | Practices Lab at University of British Columbia. 31 | Broadly speaking, he is interested in making it easier for programmers to communicate their intent to 32 | machines, and preserving that intent through compilation. 33 | More specifically, his research interests include secure and verified compilation, dependently typed 34 | programming, verification, meta-programming, and interoperability. 35 | His recent work examines type-preserving compilation of dependently typed programming languages like 36 | Coq, a technique that can enable preserving security and correctness invariants of verified software 37 | through compilation and statically enforcing those invariants in the low-level (assembly-like) code 38 | generated by compilers. 39 | 40 | 41 | Twitter Abstract: 42 | 43 | Modeling interoperability between programs in different languages is a key 44 | problem when modeling verified and secure compilation. 45 | In this talk, I discuss a work-in-progress approach to uniformly model a compiler 46 | entirely as a reduction system on open term in a multi-language semantics, 47 | rather than as a syntactic translation. 48 | This simultaneously defines the compiler and the interoperability semantics. 49 | It also provides interesting semantic insights, connecting common theorems of reductions systems to common compiler correctness theorems, and allowing us to derive JIT and AOT versions of the compiler. 50 | -------------------------------------------------------------------------------- /base.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | racket/dict) 6 | 7 | (provide (all-defined-out)) 8 | 9 | (define-language baseL 10 | [fixnum ::= integer] 11 | [arith-op ::= + - * / <] 12 | [binop ::= arith-op eq?] 13 | [tag-pred ::= pair? fixnum? boolean?] 14 | [env ::= any] ; must be a dict 15 | [x ::= variable-not-otherwise-mentioned] 16 | 17 | [l ::= (variable-prefix lb)] 18 | 19 | [primop ::= void binop tag-pred pair first second box unbox set-box!] 20 | ) 21 | 22 | (require racket/syntax) 23 | (define (fresh-prefixed-variable-maker prefix) 24 | (let ([x (box 0)]) 25 | (lambda ([name ""]) 26 | (set-box! x (add1 (unbox x))) 27 | (format-symbol "~a~a~a" prefix name (unbox x))))) 28 | 29 | (define-metafunction baseL 30 | fresh-labels : x ... -> (l ...) 31 | [(fresh-labels x ...) 32 | ,(map fresh-label (term (x ...)))]) 33 | 34 | (define fresh-label (fresh-prefixed-variable-maker 'lb)) 35 | 36 | (define (int61? x) (<= (min-int 61) x (max-int 61))) 37 | (define (max-int word-size) (sub1 (expt 2 (sub1 word-size)))) 38 | (define (min-int word-size) (* -1 (expt 2 (sub1 word-size)))) 39 | 40 | (define-metafunction baseL 41 | [(subst-all any () ()) any] 42 | [(subst-all any (x_1 x ...) (any_1 any_more ...)) 43 | (subst-all (substitute any x_1 any_1) (x ...) (any_more ...))]) 44 | 45 | (define-metafunction baseL 46 | env-extend : env (l any) ... -> env 47 | [(env-extend env (l any) ...) 48 | ,(for/fold ([d (term env)]) 49 | ([k (term (l ...))] 50 | [v (term (any ...))]) 51 | (dict-set d k v))]) 52 | 53 | (define-metafunction baseL 54 | env-ref : env l -> any 55 | [(env-ref env l) 56 | ,(dict-ref (term env) (term l))]) 57 | 58 | (define (fixnum-error? v) 59 | (not (redex-match? baseL fixnum v))) 60 | 61 | (define-metafunction baseL 62 | non-fixnum? : any -> boolean 63 | [(non-fixnum? any) 64 | ,(fixnum-error? (term any))]) 65 | 66 | ; For some reason, eval wouldn't work. *shrug*. 67 | (define (arith-op->proc v) 68 | (case v 69 | [(-) -] 70 | [(+) +] 71 | [(*) *] 72 | [(/) /] 73 | [(<) <])) 74 | 75 | (define-metafunction baseL 76 | denote : arith-op fixnum ... -> fixnum 77 | [(denote arith-op fixnum ...) 78 | ,(apply (arith-op->proc (term arith-op)) (term (fixnum ...)))]) 79 | 80 | (define (boolean-error? v) 81 | (not (redex-match? baseL boolean v))) 82 | 83 | (define-metafunction baseL 84 | non-boolean? : any -> boolean 85 | [(non-boolean? any) 86 | ,(boolean-error? (term any))]) 87 | 88 | 89 | (define-metafunction baseL 90 | non-false? : any -> boolean 91 | [(non-false? #f) 92 | #f] 93 | [(non-false? any) 94 | #t]) 95 | 96 | (define-metafunction baseL 97 | arity-error : (any ...) (any ...) -> boolean 98 | [(arity-error (any_1 ..._1) (any_2 ..._1)) 99 | #f] 100 | [(arity-error any_1 any_2) 101 | #t]) 102 | 103 | 104 | (define-metafunction baseL 105 | non-fv? : any -> boolean 106 | [(non-fv? (λ (any_1 ...) any_2)) 107 | #f] 108 | [(non-fv? any) 109 | #t]) 110 | 111 | (define-metafunction baseL 112 | [(not-equal? any any) #f] 113 | [(not-equal? any_!_1 any_!_1) #t]) 114 | -------------------------------------------------------------------------------- /code-gen.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | "lambda-b.rkt" 6 | "al.rkt") 7 | 8 | (provide (all-defined-out)) 9 | 10 | ; Design pattern for a multi-language with syntactic distinction between source 11 | ; and target, but also a combined syntax. 12 | (define-union-language tagCGL (S. λbL) (T. aL)) 13 | (define-union-language mergeCGL λbL aL) 14 | (define-union-language preCGL mergeCGL tagCGL) 15 | 16 | (define-extended-language CGL preCGL 17 | #;[e ::= .... p] 18 | [T ::= hole 19 | (begin T.e ... T e ...) 20 | (with-label label T)] 21 | [C ::= T]) 22 | 23 | (define-metafunction CGL 24 | [(subst-all () () any) any] 25 | [(subst-all (r_1 r ...) (x_1 x ...) any) 26 | (subst-all (r ...) (x ...) (substitute any x_1 r_1))]) 27 | 28 | (define cg-> 29 | (reduction-relation 30 | CGL 31 | ; #:domain e 32 | ; #:codomain e 33 | 34 | (--> (with-labels ([x_f (λ (x_a ...) e)] ...) 35 | e_2) 36 | (begin 37 | (with-label main 38 | (begin 39 | (push halt) 40 | e_2)) 41 | (with-label x_f 42 | (begin 43 | (pop r_a) ... 44 | (subst-all (r_a ...) (x_a ...) e) 45 | #;(let ([rx ]) 46 | (begin 47 | (pop r) 48 | (push rx) 49 | (jump r))))) 50 | ...) 51 | #;(fresh ((r_a ...) ((x_a ...) ...))) 52 | (where ((r_a ...) ...) 53 | ,(for/list ([ls (term ((x_a ...) ...))]) 54 | (for/list ([_ ls]) 55 | (gensym 'ra)))) 56 | #;(fresh ((rx ...) (x_f ...))) 57 | #;(fresh ((r ...) (x_f ...))) 58 | (fresh main)) 59 | (--> (let ([x (call e_l e ...)]) 60 | e_2) 61 | (begin 62 | (push e) ... 63 | (push label_r) 64 | (set! r e_l) 65 | (jump r) 66 | (with-label label_r 67 | (begin 68 | (pop r) 69 | (substitute e_2 x r)))) 70 | (fresh r)) 71 | (--> (let ([x n] ...) 72 | e_2) 73 | (subst-all 74 | (r ...) 75 | (x ...) 76 | (begin 77 | (set! r n) ... 78 | e_2)) 79 | (fresh ((r ...) (x ...)))) 80 | ; Boolean prim 81 | (--> (let ([x (compare (flag e_1 e_2) e_t e_f)]) 82 | e) 83 | (substitute 84 | (begin 85 | (compare (flag e_1 e_2) 86 | (begin 87 | (set! r e_t)) 88 | (begin 89 | (set! r e_f))) 90 | e) 91 | x 92 | r) 93 | (fresh r)) 94 | (--> 95 | (compare (flag e_1 e_2) e_t e_f) 96 | (begin 97 | (jump-if (flag e_1 e_2) label_1) 98 | (jump label_2) 99 | (with-label label_1 e_t) 100 | (with-label label_2 e_f)) 101 | (fresh label_1) 102 | (fresh label_2)) 103 | (--> 104 | (call e_l e ...) 105 | (begin 106 | (pop r) 107 | (push e) ... 108 | (push r) 109 | (set! rl e_l) 110 | (jump rl)) 111 | (fresh r) 112 | (fresh rl)) 113 | (--> n 114 | (begin 115 | (pop r) 116 | (set! rv n) 117 | (push rv) 118 | (jump r)) 119 | (fresh r) 120 | (fresh rv)))) 121 | 122 | (define cg->+ (context-closure cg-> CGL C)) 123 | 124 | (current-cache-all? #t) 125 | 126 | #;(module+ test 127 | (parameterize ([default-language SL]) 128 | (test-->> 129 | cg->+ 130 | #:equiv alpha-equivalent? 131 | (term b-eg) 132 | ))) 133 | -------------------------------------------------------------------------------- /lambda-cc.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | (require 3 | redex/reduction-semantics) 4 | 5 | (provide (all-defined-out)) 6 | 7 | ; λccL is the closure-converted λ-calculus. 8 | (define-language λccL 9 | [e ::= (pletrec ([x (λ (x ...) e)] ...) 10 | e) 11 | (cletrec ([x (closure x e ...)] 12 | ...) 13 | e) 14 | ; Need distinct form syntax for applying closure to avoid infinite loop in 15 | ; normalization. 16 | ; Could also do type-directed, if types were different. 17 | ; Or.. maybe language tag of some kind 18 | ; ((source #%app) e_1 e ...) -> ((target #%app) e_1 e_1 e ...) 19 | (apply-closure e e ...) 20 | ; should have more primitive apply to enable closure optimizations. 21 | x 22 | (begin e ... e) (box e) (unbox e) 23 | (set-box! e e) (cons e e) (car e) (cdr e) fixnum (+ e e) (- e e) (* e e) 24 | (let ([x e] ...) e) 25 | (void) '() 26 | (if e e e) 27 | (eq? e e) 28 | (pair? e) 29 | (fixnum? e) 30 | (boolean? e) 31 | (procedure? e) 32 | (box? e) 33 | (void? e) 34 | (< e e) 35 | (closure-ref e e) 36 | boolean] 37 | [x ::= variable-not-otherwise-mentioned] 38 | [fixnum ::= number] ; TODO restrict to fixnum range 39 | #:binding-forms 40 | (λ (x ...) e #:refers-to (shadow x ...)) 41 | 42 | (pletrec ([x any_1] ...) #:refers-to (shadow x ...) 43 | e_2 #:refers-to (shadow x ...)) 44 | 45 | (cletrec ([x any_1] ...) #:refers-to (shadow x ...) 46 | e_2 #:refers-to (shadow x ...)) 47 | 48 | (let ([x e_1] ...) e_2 #:refers-to (shadow x ...))) 49 | 50 | 51 | (define-term cc-eg 52 | (let ((x (box 0))) 53 | (pletrec 54 | ((factL 55 | (λ (c n) 56 | (let ((fact (closure-ref c 0))) 57 | (if (eq? n 0) 58 | 1 59 | (* n (apply-closure fact fact (- n 1)))))))) 60 | (cletrec 61 | ((fact (closure factL fact))) 62 | (pletrec 63 | ((even?L 64 | (λ (c n) 65 | (let ((odd? (closure-ref c 0))) 66 | (if (eq? n 0) 67 | #t 68 | (apply-closure odd? odd? (- n 1)))))) 69 | (odd?L 70 | (λ (c n) 71 | (let ((even? (closure-ref c 0))) 72 | (if (eq? n 0) 73 | #f 74 | (apply-closure even? even? (- n 1))))))) 75 | (cletrec 76 | ((even? (closure even?L odd?)) 77 | (odd? (closure odd?L even?))) 78 | (begin 79 | (if (apply-closure 80 | even? 81 | even? 82 | (apply-closure fact fact 5)) 83 | (pletrec 84 | ((lengthL 85 | (λ (c x) 86 | (let ((length (closure-ref c 0))) 87 | (pletrec 88 | ((empty?L (λ (c x) (let () (eq? x '()))))) 89 | (cletrec 90 | ((empty? (closure empty?L))) 91 | (if (pair? x) 92 | (if (apply-closure empty? empty? x) 93 | 0 94 | (+ 95 | 1 96 | (apply-closure 97 | length 98 | length 99 | (cdr x)))) 100 | -1))))))) 101 | (cletrec 102 | ((length (closure lengthL length))) 103 | (set-box! 104 | x 105 | (apply-closure length length (cons 1 '()))))) 106 | (set-box! x (cons 2 '()))) 107 | (unbox x)))))))) 108 | -------------------------------------------------------------------------------- /hoist.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | "lambda-cc.rkt" 6 | "lambda-h.rkt") 7 | 8 | (provide (all-defined-out)) 9 | 10 | ; Design pattern for a multi-language with syntactic distinction between source 11 | ; and target, but also a combined syntax. 12 | (define-union-language tagHL (S. λccL) (T. λhL)) 13 | (define-union-language mergeHL λccL λhL) 14 | (define-union-language preHL mergeHL tagHL) 15 | 16 | (define-extended-language HL preHL 17 | [p ::= .... e] 18 | [T ::= hole 19 | (with-labels ([T.x (λ (T.x ...) T.e)] 20 | ... 21 | [T.x (λ (T.x ...) T)] 22 | [x (λ (x ...) e)] 23 | ...) 24 | p) 25 | (with-labels ([T.x (λ (T.x ...) T.e)] 26 | ...) 27 | T) 28 | (cletrec ([T.x (closure T.e ...)] 29 | ... 30 | [T.x (closure T.e ... T e ...)] 31 | [x (closure e ...)] ...) 32 | e) 33 | (cletrec ([T.x (closure T.e ...)] ...) 34 | T) 35 | (apply-closure T.e ... T e ...) 36 | (kw T.e ... T e ...) 37 | (let ([T.x_1 T.e] 38 | ... 39 | [x_i T] 40 | [x_n e] ...) 41 | e) 42 | (let ([T.x_1 T.e] ...) 43 | T)] 44 | [kw ::= begin void eq? pair? fixnum? boolean? procedure? box? void? < + - * 45 | cons car cdr box unbox set-box! if] 46 | 47 | ; Spec, but slow 48 | #;[C ::= (compatible-closure-context p #:wrt e)] 49 | [C ::= T]) 50 | 51 | (define h-> 52 | (reduction-relation 53 | HL 54 | #:domain p 55 | #:codomain p 56 | 57 | (--> (in-hole C (pletrec ([x any_1] ...) e)) 58 | (with-labels ([x any_1] ...) 59 | (in-hole C e))) 60 | 61 | (--> (with-labels ([x_1 any_1] ...) 62 | (with-labels ([x_2 any_2] ...) 63 | p)) 64 | (with-labels ([x_2 any_2] 65 | ... 66 | [x_1 any_1] 67 | ...) 68 | p)))) 69 | 70 | #;(define h->+ (context-closure h-> HL T)) 71 | 72 | (parameterize ([default-language HL]) 73 | (test-->> 74 | h-> 75 | #:equiv alpha-equivalent? 76 | (term cc-eg) 77 | (term 78 | (with-labels 79 | ((factL 80 | (λ (c n) 81 | (let ((fact (closure-ref c 0))) 82 | (if (eq? n 0) 1 (* n (apply-closure fact fact (- n 1))))))) 83 | (even?L 84 | (λ (c n) 85 | (let ((odd? (closure-ref c 0))) 86 | (if (eq? n 0) #t (apply-closure odd? odd? (- n 1)))))) 87 | (odd?L 88 | (λ (c n) 89 | (let ((even? (closure-ref c 0))) 90 | (if (eq? n 0) #f (apply-closure even? even? (- n 1)))))) 91 | (lengthL 92 | (λ (c x) 93 | (let ((length (closure-ref c 0))) 94 | (cletrec 95 | ((empty? (closure empty?L))) 96 | (if (pair? x) 97 | (if (apply-closure empty? empty? x) 98 | 0 99 | (+ 1 (apply-closure length length (cdr x)))) 100 | -1))))) 101 | (empty?L (λ (c x) (let () (eq? x '()))))) 102 | (let ((x (box 0))) 103 | (cletrec 104 | ((fact (closure factL fact))) 105 | (cletrec 106 | ((even? (closure even?L odd?)) (odd? (closure odd?L even?))) 107 | (begin 108 | (if (apply-closure even? even? (apply-closure fact fact 5)) 109 | (cletrec 110 | ((length (closure lengthL length))) 111 | (set-box! x (apply-closure length length (cons 1 '())))) 112 | (set-box! x (cons 2 '()))) 113 | (unbox x))))))))) 114 | -------------------------------------------------------------------------------- /cc2.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | (require racket/set) 3 | 4 | (require 5 | redex/reduction-semantics 6 | "lambda-s.rkt" 7 | "lambda-cc.rkt") 8 | 9 | (define-union-language preCCL (S. λiL) (T. λccL)) 10 | 11 | (define-extended-language CCL preCCL 12 | [S.e ::= .... (ST T.e)] 13 | [T.e ::= .... (TS S.e)] 14 | [TC ::= (compatible-closure-context T.e)] 15 | [e ::= S.e T.e] 16 | [T ::= hole 17 | (pletrec ([T.x (λ (T.x ...) T.e)] 18 | ... 19 | [T.x (λ (T.x ...) T)] 20 | [x (λ (x ...) e)] 21 | ...) 22 | e) 23 | (pletrec ([T.x (λ (T.x ...) T.e)] ...) 24 | T) 25 | (cletrec ([T.x (closure T.e ...)] 26 | ... 27 | [T.x (closure T.e ... T e ...)] 28 | [x (closure e ...)] ...) 29 | e) 30 | (cletrec ([T.x (closure T.e ...)] ...) 31 | T) 32 | (apply-closure T.e ... T e ...) 33 | (kw T.e ... T e ...) 34 | (let ([T.x_1 T.e] 35 | ... 36 | [x_i T] 37 | [x_n e] ...) 38 | e) 39 | (let ([T.x T.e] ...) 40 | T)] 41 | [kw ::= begin void eq? pair? fixnum? boolean? procedure? box? void? < + - * cons car cdr box unbox set-box! if] 42 | [terminal ::= number boolean x] 43 | [x ::= S.x T.x]) 44 | 45 | (define-metafunction CCL 46 | free-vars : any -> (x ...) 47 | [(free-vars x) (x)] 48 | [(free-vars (λ (x ...) e)) 49 | (free-vars ((substitute e x 0) ...))] 50 | [(free-vars (letrec ([x_1 any_1] ...) e)) 51 | (free-vars ((substitute e x_1 0) 52 | ... 53 | (substitute (any_1 ...) x_1 0) 54 | ...))] 55 | [(free-vars (pletrec ([x_1 any_1] ...) e)) 56 | (free-vars ((substitute e x_1 0) 57 | ... 58 | (substitute (any_1 ...) x_1 0) 59 | ...))] 60 | [(free-vars (cletrec ([x_1 any_1] ...) e)) 61 | (free-vars ((substitute e x_1 0) 62 | ... 63 | (substitute (any_1 ...) x_1 0) 64 | ...))] 65 | [(free-vars (let ([x_1 e_1] ...) e_2)) 66 | (free-vars ((substitute e_2 x_1 0) 67 | ... 68 | e_1 ...))] 69 | [(free-vars (any ...)) 70 | ,(set-union '() (term (x ... ...))) 71 | (where ((x ...) ...) ((free-vars any) ...))] 72 | [(free-vars any) ()]) 73 | 74 | (define cc-> 75 | (reduction-relation 76 | CCL 77 | #:domain T.e 78 | #:codomain T.e 79 | 80 | (--> 81 | (TS (letrec ([x_f (λ (x ...) e_1)] ...) 82 | e_2)) 83 | (pletrec ([x_fl (λ (x_c x ...) 84 | (let ([x_f0 (closure-ref x_cc e_i)] 85 | ...) 86 | (TS e_1)))] 87 | ...) 88 | (cletrec ([x_f (closure x_fl x_f0 ...)] 89 | ...) 90 | (TS e_2))) 91 | (fresh ((x_fl ...) (x_f ...))) 92 | (fresh ((x_c ...) (x_f ...))) 93 | (where ((x_f0 ...) ...) ((free-vars (λ (x ...) e_1)) ...)) 94 | (where ((x_cc ...) ...) 95 | ,(for/list ([ls (term ((x_f0 ...) ...))] 96 | [x_c (term (x_c ...))]) 97 | (for/list ([_ ls]) 98 | x_c))) 99 | (where ((e_i ...) ...) 100 | ,(map (lambda (x) (build-list (length x) values)) (term ((x_f0 ...) ...))))) 101 | 102 | (--> (TS (e_1 e ...)) (apply-closure (TS e_1) (TS e_1) (TS e) ...)) 103 | 104 | (--> (TS T.e) T.e) 105 | (--> (TS (ST e)) e) 106 | (--> (TS (in-hole T S.e)) (TS (in-hole T (ST (TS S.e)))) 107 | (where (any_!_1 any_!_1) (hole T))))) 108 | 109 | (define cc->+ (context-closure cc-> CCL T)) 110 | 111 | (current-cache-all? #t) 112 | 113 | #;(parameterize ([default-language CCL]) 114 | (test-->> 115 | cc->+ 116 | #:equiv alpha-equivalent? 117 | (term 118 | (TS 119 | (letrec ([fact (λ (n) 120 | (if (eq? n 0) 121 | 1 122 | (* n (fact (- n 1)))))]) 123 | (fact 5)))) 124 | (term 125 | (pletrec ([fact-label 126 | (λ (c x) 127 | (let ([fact (closure-ref c 0)]) 128 | (if (eq? x 0) 129 | 1 130 | (* x (apply-closure fact fact (- x 1))))))]) 131 | (cletrec ([fact (closure fact-label fact)]) 132 | (apply-closure fact fact 5)))))) 133 | 134 | -------------------------------------------------------------------------------- /notes.md: -------------------------------------------------------------------------------- 1 | - Compilers are normalizing reductions in multi-language rewrite systems? 2 | - Could be used to model semantics of JIT, macro systems, and languages with embedded assembly. 3 | Multi-language semantics have already been used this way, but usually have 4 | quite rigid restricts at the language boundaries. 5 | For example, (one of Amal's paper) allows a closure-converted language to 6 | interroperate with a functional source language, but it assumes all functions 7 | are closed during reduction. This semantics forbids embedding "open" functions 8 | (implicit closures) (TODO check that). 9 | Instead, treating the entire compiler as the multi-language reduction system 10 | provides a unified semantics for both execution and translation between languages. 11 | - seems related to meta-circular interpreters, but with a fixed hierarchy. Can 12 | escape up or down the tower, but not extend the tower. 13 | - expressed this way, inherently compositional. guarantees if correct then correct w.r.t. separate compilation. 14 | (w.r.t. compositional correctness.. harder to say. secure compilation? depends 15 | on ability to enforce boundaries... certainly not guaranteed) 16 | - correctness amounts to Church-Rosser in the multi-language. 17 | - Requires careful choice of syntax. Want to each syntax to correspond to reduction in 1 language. 18 | Multi-language boundaries could help here, but a bit tedious to write 19 | translation once explicit boundary terms get involved. 20 | Can no longer rely on compatible-closure to specify translation, but need 21 | administrative rules for managing boundary terms. 22 | - The nanopass library and Lisp-style macro systems are close to an 23 | implementation technique. Each allows expressing transformations on syntax as 24 | local rewrites, and automatically runs the rewrites to a normal form. 25 | However, the set of rewrites expressible is limited. 26 | In particular, context sensitive rewrites are often not expressible. 27 | 28 | Racket's macro system is close to enabling the context sensitive rewrites. 29 | The macro system tracks various context, such as expression context and 30 | definition context, and macros can provide hooks to perform context sensitive 31 | rewrites. 32 | 33 | Macro systems differ in a key way: they are open recursive. 34 | This allows macros to generate new macros. 35 | Supporting this would require the reduction system be open recursive and 36 | support a first-class notion of reduction rule. 37 | This is unlike anything the author has seen in programming languages theory, 38 | and seems likely to be quite difficult to reason about. 39 | 40 | In other respects, macro systems are similar. 41 | Macro system elaborate all code through a series reductions to a normal form, 42 | after which a traditional interpreter or compiler takes over. 43 | The process of macro expansion is performed left-to-right, outside-in, like 44 | most implementations of reduction systems. 45 | 46 | 47 | The nanopass framework supports a sophisticated pattern language for expressing 48 | transformations that is quite like the deterministic implementation of rewrite 49 | systems we presented. 50 | Each translation defines one translation helper for each non-terminal in the language. 51 | The pattern language supports a catamorphism pattern, which applies the 52 | appropriate translation helper for the sub-expression's non-terminal. 53 | This nearly allows the compiler writer to express the compiler as a reduction 54 | system, although the compiler writer is not free to rely on non-deterministic 55 | rules, or to write multi-language terms that normalize to the target language. 56 | - The semantics goes someway to providing insight into the process of 57 | bootstrapping, the often mind-bending process of compiling a compiler with 58 | itself. 59 | Bootstrapped compilers can include kind of mysterious translations, such as: 60 | ``` 61 | (match x 62 | ['#t #t] 63 | ['#f #f] 64 | ...) 65 | ``` 66 | which we all know to be generating the bit-level representation of booleans 67 | from their symbolic representation. 68 | This relies on the fact that the bootstrapped compiler already knows how to 69 | compile the symbol '#t to the bit-level representation #b000110, which we can 70 | refer to directly by the symbol #t. 71 | In a normalizing rewrite system, we can generate #t in the multi-language 72 | reduction relying on the reduction system to normalize that to its bit-level 73 | representation in a later step of normalization. 74 | - Read this. Then reread it. 75 | https://docs.racket-lang.org/redex/redex2015.html?q=subst/proc&q=extended-language#%28part._mon-mor%29 76 | There's something deep I haven't quite understood. 77 | -------------------------------------------------------------------------------- /cc.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | (require 3 | racket/string 4 | racket/set) 5 | 6 | (require 7 | redex/reduction-semantics 8 | racket/syntax 9 | "lambda-s.rkt" 10 | "lambda-cc.rkt") 11 | 12 | (provide (all-defined-out)) 13 | 14 | ; Design pattern for a multi-language with syntactic distinction between source 15 | ; and target, but also a combined syntax. 16 | (define-union-language tagCCL (S. λiL) (T. λccL)) 17 | (define-union-language mergeCCL λiL λccL) 18 | (define-union-language preCCL mergeCCL tagCCL) 19 | ; For left-to-right translation order, to make the translation faster and less 20 | ; non-deterministic. 21 | (define-extended-language CCL preCCL 22 | [T ::= hole 23 | (pletrec ([T.x (λ (T.x ...) T.e)] 24 | ... 25 | [T.x (λ (T.x ...) T)] 26 | [x (λ (x ...) e)] 27 | ...) 28 | e) 29 | (pletrec ([T.x (λ (T.x ...) T.e)] ...) 30 | T) 31 | (cletrec ([T.x (closure T.e ...)] 32 | ... 33 | [T.x (closure T.e ... T e ...)] 34 | [x (closure e ...)] ...) 35 | e) 36 | (cletrec ([T.x (closure T.e ...)] ...) 37 | T) 38 | (apply-closure T.e ... T e ...) 39 | (kw T.e ... T e ...) 40 | (let ([T.x_1 T.e] 41 | ... 42 | [x_i T] 43 | [x_n e] ...) 44 | e) 45 | (let ([x_1 T.e] ...) 46 | T)] 47 | [kw ::= begin void eq? pair? fixnum? boolean? procedure? box? void? < + - * 48 | cons car cdr box unbox set-box! if]) 49 | 50 | (define-metafunction CCL 51 | free-vars : any -> (x ...) 52 | [(free-vars x) (x)] 53 | [(free-vars (λ (x ...) e)) 54 | (free-vars ((substitute e x 0) ...))] 55 | [(free-vars (letrec ([x_1 any_1] ...) e)) 56 | (free-vars ((substitute e x_1 0) 57 | ... 58 | (substitute (any_1 ...) x_1 0) 59 | ...))] 60 | [(free-vars (pletrec ([x_1 any_1] ...) e)) 61 | (free-vars ((substitute e x_1 0) 62 | ... 63 | (substitute (any_1 ...) x_1 0) 64 | ...))] 65 | [(free-vars (cletrec ([x_1 any_1] ...) e)) 66 | (free-vars ((substitute e x_1 0) 67 | ... 68 | (substitute (any_1 ...) x_1 0) 69 | ...))] 70 | [(free-vars (let ([x_1 e_1] ...) e_2)) 71 | (free-vars ((substitute e_2 x_1 0) 72 | ... 73 | e_1 ...))] 74 | [(free-vars (any ...)) 75 | ,(set-union '() (term (x ... ...))) 76 | (where ((x ...) ...) ((free-vars any) ...))] 77 | [(free-vars any) ()]) 78 | 79 | (define (redex-id-get-base-name x) 80 | (car (string-split (symbol->string x) "«"))) 81 | 82 | (define (fresh-id x) 83 | (format-symbol "~a" (redex-id-get-base-name x))) 84 | 85 | (define (fresh-label x) 86 | (format-symbol "~aL" (redex-id-get-base-name x))) 87 | 88 | (define cc-> 89 | (reduction-relation 90 | CCL 91 | #:domain e 92 | #:codomain e 93 | 94 | (--> 95 | (letrec ([x_f (λ (x ...) e_1)] ...) 96 | e_2) 97 | (pletrec ([x_fl (λ (x_c x ...) 98 | (let ([x_f0 (closure-ref x_cc e_i)] 99 | ...) 100 | e_1))] 101 | ...) 102 | (cletrec ([x_f (closure x_fl x_f0 ...)] 103 | ...) 104 | e_2)) 105 | (where (x_fl ...) ,(map fresh-label (term (x_f ...)))) 106 | (where (x_c ...) ,(map fresh-id (map (lambda _ 'c) (term (x_f ...))))) 107 | (where ((x_f0 ...) ...) ((free-vars (λ (x ...) e_1)) ...)) 108 | (where ((x_cc ...) ...) 109 | ,(for/list ([ls (term ((x_f0 ...) ...))] 110 | [x_c (term (x_c ...))]) 111 | (for/list ([_ ls]) 112 | x_c))) 113 | (where ((e_i ...) ...) 114 | ,(for/list ([ls (term ((x_f0 ...) ...))]) 115 | (build-list (length ls) values)))) 116 | 117 | (--> (e_1 e ...) (apply-closure e_1 e_1 e ...)))) 118 | 119 | #;(define cc->+ (compatible-closure cc-> CCL e)) 120 | (define cc->+ (context-closure cc-> CCL T)) 121 | 122 | (parameterize ([default-language CCL]) 123 | (test-->> 124 | cc->+ 125 | #:equiv alpha-equivalent? 126 | (term 127 | (letrec ([fact (λ (n) 128 | (if (eq? n 0) 129 | 1 130 | (* n (fact (- n 1)))))]) 131 | (fact 5))) 132 | (term 133 | (pletrec ([fact-label 134 | (λ (c x) 135 | (let ([fact (closure-ref c 0)]) 136 | (if (eq? x 0) 137 | 1 138 | (* x (apply-closure fact fact (- x 1))))))]) 139 | (cletrec ([fact (closure fact-label fact)]) 140 | (apply-closure fact fact 5))))) 141 | 142 | (test-->> 143 | cc->+ 144 | #:equiv alpha-equivalent? 145 | (term s-eg) 146 | (term 147 | (let ((x (box 0))) 148 | (pletrec 149 | ((factL 150 | (λ (c n) 151 | (let ((fact (closure-ref c 0))) 152 | (if (eq? n 0) 153 | 1 154 | (* n (apply-closure fact fact (- n 1)))))))) 155 | (cletrec 156 | ((fact (closure factL fact))) 157 | (pletrec 158 | ((even?L 159 | (λ (c n) 160 | (let ((odd? (closure-ref c 0))) 161 | (if (eq? n 0) 162 | #t 163 | (apply-closure odd? odd? (- n 1)))))) 164 | (odd?L 165 | (λ (c n) 166 | (let ((even? (closure-ref c 0))) 167 | (if (eq? n 0) 168 | #f 169 | (apply-closure even? even? (- n 1))))))) 170 | (cletrec 171 | ((even? (closure even?L odd?)) 172 | (odd? (closure odd?L even?))) 173 | (begin 174 | (if (apply-closure 175 | even? 176 | even? 177 | (apply-closure fact fact 5)) 178 | (pletrec 179 | ((lengthL 180 | (λ (c x) 181 | (let ((length (closure-ref c 0))) 182 | (pletrec 183 | ((empty?L (λ (c x) (let () (eq? x '()))))) 184 | (cletrec 185 | ((empty? (closure empty?L))) 186 | (if (pair? x) 187 | (if (apply-closure empty? empty? x) 188 | 0 189 | (+ 190 | 1 191 | (apply-closure 192 | length 193 | length 194 | (cdr x)))) 195 | -1))))))) 196 | (cletrec 197 | ((length (closure lengthL length))) 198 | (set-box! 199 | x 200 | (apply-closure length length (cons 1 '()))))) 201 | (set-box! x (cons 2 '()))) 202 | (unbox x))))))))) 203 | ) 204 | 205 | ;; TODO: Add reduction relations, do union-reduction-relation, and show simulation. 206 | -------------------------------------------------------------------------------- /lfcs-2021-seminar-talk/snippets.tex: -------------------------------------------------------------------------------- 1 | \documentclass{article} 2 | \usepackage{mathpartir} 3 | \usepackage{amsmath} 4 | \usepackage{mttex/mttex} 5 | \usepackage{bm} 6 | 7 | \newcommand{\sfontnonterm}[1]{\sfont{\emph{#1}}} 8 | \newlanguage{\scolor}{\sfontnonterm}{\sfontsym}{s} 9 | {x,e,v,E} 10 | {} 11 | {} 12 | {} 13 | 14 | \newcommand{\sfune}[2]{(\sfontsym{\lambda}#1.#2)} 15 | \newcommand{\sappe}[2]{(#1\; #2)} 16 | \newcommand{\ssetbange}[2]{(\sfont{set!}\; #1\; #2)} 17 | \newcommand{\slete}[2]{(\sfont{let}\; (#1)\; #2)} 18 | \newcommand{\sbegine}[2]{(\sfont{begin}\; #1\; #2)} 19 | \newcommand{\SAe}[1]{\sfont{S}\tfont{A}(#1)} 20 | 21 | \newcommand{\tfontnonterm}[1]{\emph{\tcolor{$\bm{#1}$}}} 22 | \newlanguage{\tcolor}{\tfontnonterm}{\tfontsym}{t} 23 | {x,M,V,N,E} 24 | {} 25 | {} 26 | {} 27 | 28 | 29 | \newcommand{\tfune}[2]{\tfontsym{\lambda}#1. #2} 30 | 31 | \begin{document} 32 | 33 | $$ 34 | \begin{array}{rcl} 35 | \se & ::= & \sx \mid \sfune{\sx}{\se} \mid \sappe{\se}{\se} \mid \ssetbange{\sx}{\se} \mid \slete{[\sx\; \se]\; \dots}{\se} \mid \sbegine{\se}{\dots} \mid .... \\&& 36 | \mid \SAe{\tM} 37 | \end{array} 38 | $$ 39 | 40 | 41 | $$ 42 | \begin{array}{rcl} 43 | V & ::= & x \mid (\lambda x.M) \mid \textrm{true} \mid \textrm{false} \mid .... \\ && AS(e) \\ 44 | N & ::= & V \mid (V\; V) \mid (\textrm{set!}\; x\; V) \mid .... \\ && AS(e) \\ 45 | M & ::= & N \mid (\textrm{let}\; ([x\; N]\; ...)\; M) \mid (\textrm{begin}\; N\; ...\; M) \mid (\textrm{if}\; V\; M\; M) \mid .... \\ && AS(e) \\ 46 | \end{array} 47 | $$ 48 | 49 | $$ 50 | \begin{array}{rcl} 51 | v & ::= & (\lambda x.e) \mid true \mid .... \\ 52 | E & ::= & [\cdot] \mid E\; e \mid v\; E \mid .... \\ 53 | H & ::= & \cdot \mid H,x \mapsto v 54 | \end{array} 55 | $$ 56 | 57 | \fbox{$H;\; e \to H;\; e$} 58 | \begin{mathpar} 59 | \begin{array}{rcl} 60 | (\lambda x.e)\; v & \to & e[x\; := v] \\ 61 | & \vdots 62 | 63 | \end{array} 64 | 65 | \inferrule 66 | {e \to e'} 67 | {E[e] \to E[e']} 68 | \end{mathpar} 69 | 70 | 71 | $$ 72 | \begin{array}{rcl} 73 | E & ::= & [\cdot] \mid (\textrm{let} ([x\; V]\; ... [x\; [\cdot]] [x\; N]\; ...)\; M) \\ 74 | H & ::= & \cdot \mid H,x \mapsto V 75 | \end{array} 76 | $$ 77 | 78 | 79 | \fbox{$H;\; M \to H;\; M$} 80 | \begin{mathpar} 81 | \begin{array}{rclr} 82 | (\lambda x.M)\; V & \to & M[x\; := V] \\ 83 | &\vdots& \\ 84 | (\textrm{let}\; ([x\; V]\; ...)\; M) & \to & M[x := V\; ...] 85 | \end{array} 86 | \end{mathpar} 87 | 88 | \fbox{$e \to^a e$} 89 | \begin{mathpar} 90 | \begin{array}{rclr} 91 | M & \to^a & SA(M) & \textrm{[A-normal]} \\[2pt] 92 | 93 | E[(\textrm{let}\; ([x\; e]\; ...)\; e_2] & \to^a & 94 | SA(\textrm{let}\; ([x\; AS(e)]\; ...)\; AS(E[e_2])) & \textrm{[A-merge-let]} \\[2pt] 95 | 96 | E[(\textrm{begin}\; e_1\; ...\; e_2)] & \to^a & 97 | SA(\textrm{begin}\; AS(e_1)\; ...\; AS(E[e_2])) & \textrm{[A-merge-begin]} \\[2pt] 98 | 99 | E[N] & \to^a & SA(\textrm{let}\; ([x\; N])\; AS(E[x])) & \textrm{[A-lift]} \\ 100 | && \text{where $E$ is not an ``$N$ accepting target context''} \\ 101 | && \text{and $N$ is not a $V$} \\ 102 | & \vdots & 103 | \end{array} 104 | \end{mathpar} 105 | 106 | \begin{mathpar} 107 | \begin{array}{rcl} 108 | p & ::= & e \mid M \\ 109 | T & ::= & AS(C_M) \\ 110 | C_M & ::= & [\cdot] \mid 111 | (\textrm{let}\; ([x\; N]\; ...) C_M) \mid .... 112 | \end{array} 113 | \end{mathpar} 114 | 115 | \fbox{$M\; {}^s\!\!\to^a M$} 116 | \begin{mathpar} 117 | \inferrule 118 | {p \to^a p'} 119 | {C[T[p]]\; {}^s\!\!\to^a C[T[p']]} 120 | 121 | \inferrule 122 | {~} 123 | {C[SA(AS(e))] \;{}^s\!\!\to^a C[e]} 124 | 125 | \inferrule 126 | {~} 127 | {C[AS(SA(M))] \;{}^s\!\!\to^a C[M]} 128 | \end{mathpar} 129 | 130 | \fbox{$H;\; p \to^p H;\; p$} 131 | \begin{mathpar} 132 | \inferrule*[right=\text{Step-S1}] 133 | {H\; e \to H';\; e'} 134 | {H;\; e \to^p H';\; e'} 135 | 136 | \inferrule*[right=\text{Step-S2}] 137 | {H\; e \to H';\; e'} 138 | {H;\; AS(e) \to^p H';\; AS(e')} 139 | 140 | \inferrule*[right=\text{Step-T1}] 141 | {H\; M \to H';\; M'} 142 | {H;\; M \to^p H';\; M'} 143 | 144 | \inferrule*[right=\text{Step-T2}] 145 | {H\; M \to H';\; M'} 146 | {H;\; SA(M) \to^p H';\; SA(M')} 147 | 148 | \inferrule*[right=\text{Step-Across}] 149 | {M \; {}^s\!\!\to^a M'} 150 | {H;\; M \to^p H;\; M'} 151 | \end{mathpar} 152 | 153 | 154 | \fbox{$e \Downarrow_{AOT} M$} 155 | \begin{mathpar} 156 | \inferrule 157 | {AS(e) \; {}^s\!\!\to^a\!\!*\; M 158 | \\ 159 | M \; {}^s\!\!\not\to^a M'} 160 | {e \Downarrow_{AOT} M} 161 | \end{mathpar} 162 | 163 | \newtheorem{proposition}{Proposition} 164 | \newtheorem{corollary}{Corollary} 165 | \newtheorem{theorem}{Theorem} 166 | 167 | \begin{proposition}[Confluence] 168 | ~\\ 169 | If $H; p \to^{p*} H_1; p_1$ and 170 | $H; p \to^{p*} H_2; p_2$ then \\ 171 | $H_1; p_1 \to^{p*} H'; p'$ and 172 | $H_2; p_2 \to^{p*} H'; p'$. 173 | \end{proposition} 174 | 175 | \begin{corollary}[Whole-Program Correctness] 176 | ~\\ 177 | If $\cdot; e \to^{p*} H_1; v$ and 178 | $e \Downarrow_{AOT} M$ then \\ 179 | $\cdot; M \to^{p*} H_2; v$ 180 | \end{corollary} 181 | 182 | \begin{proposition}[Type Preservation] 183 | ~\\ 184 | If $\Gamma \vdash e : A$ and $e \Downarrow_{AOT} M$ then $\Gamma' \vdash M : B$ (ish) 185 | \end{proposition} 186 | 187 | \begin{proposition}[Subject Reduction] 188 | ~\\ 189 | If $\Gamma \vdash e : A$ and $H; e \to^{p} H'; e'$ then $\Gamma \vdash e' : A$ 190 | \end{proposition} 191 | 192 | \begin{theorem}[Subject Reduction Implies Type Safety] 193 | ~\\ 194 | If ($\Gamma \vdash e : A$ and $H; e \to^{p} H'; e'$ implies $\Gamma \vdash e' : A$) then\\ 195 | ($\Gamma \vdash e : A$ and $e \Downarrow_{AOT} M$ implies $\Gamma' \vdash M : B$) 196 | \end{theorem} 197 | 198 | \newtheorem{definition}{Definition} 199 | \newtheorem{lemma}{Lemma} 200 | 201 | \begin{definition}[Contextual Equivalence] 202 | ~\\ 203 | \begin{displaymath} 204 | \begin{array}{rclcl} 205 | H_1; p_1 & \approx_{SA} & H_2; p_2 &\defeq& \forall C. H_1; C[p_1] \Downarrow^p \iff H_2; C[p_2] \Downarrow^p\\ 206 | H_1; e_1 & \approx_{S} & H_2; e_2 &\defeq& \forall C_S. H_1; C_S[e_1] \Downarrow^s \iff H_2; C_S[e_2] \Downarrow^s\\ 207 | H_1; M_1 & \approx_{A} & H_2; M_2 &\defeq& \forall C_A. H_1; C_A[M_1] \Downarrow^a \iff H_2; C_A[M_2] \Downarrow^a 208 | \end{array} 209 | \end{displaymath} 210 | \end{definition} 211 | 212 | \begin{proposition}[Full Abstraction] 213 | ~\\ 214 | Suppose $e_1 \Downarrow_{AOT} M_1$ and $e_2 \Downarrow_{AOT} M_2$.\\ 215 | $H_1; e_1 \approx_S H_2; e_2$ if and only if $H_1; M_1 \approx_A H_2; M_2$. 216 | \end{proposition} 217 | 218 | \begin{lemma}[Full Abstraction (hard part)] 219 | ~\\ 220 | Suppose $e_1 \Downarrow_{AOT} M_1$ and $e_2 \Downarrow_{AOT} M_2$.\\ 221 | $H_1; e_1 \approx_S H_2; e_2$ if and only if $H_1; M_1 \approx_{SA} H_2; M_2$. 222 | \end{lemma} 223 | 224 | \begin{lemma}[Full Abstraction (easy part)] 225 | ~\\ 226 | Suppose $e_1 \Downarrow_{AOT} M_1$ and $e_2 \Downarrow_{AOT} M_2$.\\ 227 | $H_1; e_1 \approx_{SA} H_2; e_2$ if and only if $H_1; M_1 \approx_{A} H_2; M_2$. 228 | \end{lemma} 229 | 230 | 231 | 232 | \end{document} 233 | -------------------------------------------------------------------------------- /lambda-a.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | (require 3 | "base.rkt" 4 | "lambda-s.rkt" 5 | redex/reduction-semantics) 6 | 7 | (provide (all-defined-out)) 8 | 9 | ; λaL is the ANF language. 10 | (define-extended-language λaL baseL 11 | [v ::= '() fixnum boolean (void) x] 12 | [n ::= v (v ...) (primop v v ...)] 13 | [e ::= n (letrec ([x (λ (x ...) e)] ...) e) (let ([x n] ...) e) (begin n ... e) (if v e e)] 14 | 15 | #;[Cm ::= (compatible-closure-context e)] 16 | #;[Cn ::= (compatible-closure-context e #:wrt n)] 17 | #;[Cv ::= (compatible-closure-context e #:wrt V)] 18 | [Cv ::= Cn 19 | (in-hole Cn (v ... hole v ...)) 20 | (in-hole Cn (primop v ... hole v ...)) 21 | (in-hole Cm (if hole e e))] 22 | [Cn ::= Cm 23 | (in-hole Cm (let ([x n] ... [x hole] [x n] ...) e)) 24 | (in-hole Cm (begin n ... hole n ... e))] 25 | [Cm ::= hole 26 | (let ([x n] ...) Cm) 27 | (letrec ([x (λ (x ...) e)] ... 28 | [x (λ (x ...) Cm)] 29 | [x (λ (x ...) e)] ...) e) 30 | (letrec ([x (λ (x ...) e)] ...) Cm) 31 | (begin n ... Cm) 32 | (if v Cm e) 33 | (if v e Cm)] 34 | 35 | ;; For display only 36 | [Γ ::= ∅] 37 | [τ ::= ∅] 38 | 39 | #:binding-forms 40 | (λ (x ...) e #:refers-to (shadow x ...)) 41 | (letrec ([x any] ...) #:refers-to (shadow x ...) 42 | e #:refers-to (shadow x ...)) 43 | (let ([x e_1] ...) e_2 #:refers-to (shadow x ...))) 44 | 45 | (define-extended-language λaL-eval λaL 46 | [S ::= env] ; must be a dict of labels to values 47 | [E ::= (let ([x v] ... [x hole] [x n] ...) e) (begin v ... hole e ...)] 48 | [v ::= .... l] 49 | [fv ::= (λ (x ...) e)] 50 | [hv ::= v fv (pair v v) (box v)]) 51 | 52 | (define λa->composition 53 | (reduction-relation 54 | λaL-eval 55 | #:domain (S e) 56 | #:codomain (S e) 57 | #:arrow -->λa 58 | 59 | (-->λa (S (let ([x v] ...) e)) 60 | (S (subst-all e (x ...) (v ...)))) 61 | 62 | (-->λa (S_1 (letrec ([x fv] ...) e)) 63 | (S_2 (in-hole E (subst-all e (x ...) (l ...)))) 64 | 65 | (where (l ...) (fresh-labels x ...)) 66 | (where (v_1 ...) ((subst-all fv (x ...) (l ...)) ...)) 67 | (where S_2 (store-extend S_1 (l v_1) ...))) 68 | 69 | (-->λa (S (begin v ... e)) 70 | (S e)))) 71 | 72 | (define-metafunction λaL-eval 73 | hcompose : E e -> e 74 | [(hcompose E n) (in-hole E n)] 75 | [(hcompose E (let ([x n] ...) e)) 76 | (let ([x n] ...) (hcompose E e))] 77 | [(hcompose E (begin n ... e)) 78 | (begin n ... (hcompose E e))] 79 | [(hcompose E (if v e_1 e_2)) 80 | (if v (hcompose E e_1) (hcompose E e_2))] 81 | [(hcompose E (letrec ([x any] ...) e)) 82 | (letrec ([x any] ...) (hcompose E e))]) 83 | 84 | (define λa->admin 85 | (reduction-relation 86 | λaL-eval 87 | #:domain (S e) 88 | #:codomain (S e) 89 | #:arrow -->λa 90 | 91 | (-->λa (S (in-hole E (l v ..._1))) 92 | (S (hcompose E (subst-all e (x ...) (v ...)))) 93 | (where (λ (x ..._1) e) (store-ref S l))) 94 | 95 | (-->λa (S (in-hole E (l v ...))) 96 | (S (error)) 97 | (where (λ (x ...) e) (store-ref S l)) 98 | (side-condition (term (arity-error (x ...) (v ...))))) 99 | 100 | (-->λa (S (in-hole E (l v ...))) 101 | (S (error)) 102 | (where hv (store-ref S l)) 103 | (side-condition (term (non-fv? hv)))))) 104 | 105 | (define λa->bools 106 | (reduction-relation 107 | λaL-eval 108 | #:domain (S e) 109 | #:codomain (S e) 110 | #:arrow -->λa 111 | 112 | (-->λa (S (if #f e_1 e_2)) 113 | (S e_2)) 114 | (-->λa (S (if v e_1 e_2)) 115 | (S e_1) 116 | (side-condition (term (non-false? v)))) 117 | 118 | (-->λa (S (in-hole E (boolean? #t))) 119 | (S (in-hole E #t))) 120 | (-->λa (S (in-hole E (boolean? #f))) 121 | (S (in-hole E #t))) 122 | (-->λa (S (in-hole E (boolean? v))) 123 | (S (in-hole E #f)) 124 | (side-condition (term (non-boolean? v)))))) 125 | 126 | (define λa->boxes 127 | (reduction-relation 128 | λaL-eval 129 | #:domain (S e) 130 | #:codomain (S e) 131 | #:arrow -->λa 132 | 133 | ;; Boxes 134 | (-->λa (S (in-hole E (box v))) 135 | (S_1 (in-hole E l)) 136 | (where l ,(fresh-label)) 137 | (where S_1 (store-set S l (box v)))) 138 | (-->λa (S (in-hole E (unbox l))) 139 | (S (in-hole E v)) 140 | (where (box v) (store-ref S l))) 141 | (-->λa (S (in-hole E (unbox v))) 142 | (S (error)) 143 | (side-condition (box-error? (term S) (term v)))) 144 | (-->λa (S_1 (in-hole E (set-box! l v))) 145 | (S_2 (in-hole E (void))) 146 | (where S_2 (store-set S_1 l (box v)))) 147 | (-->λa (S (in-hole E (box? l))) 148 | (S (in-hole E #t)) 149 | (where (box v) (store-ref S l))) 150 | (-->λa (S (in-hole E (box? v))) 151 | (S (in-hole E #f)) 152 | (side-condition (box-error? (term S) (term v)))))) 153 | 154 | (define λa->pairs 155 | (reduction-relation 156 | λaL-eval 157 | #:domain (S e) 158 | #:codomain (S e) 159 | #:arrow -->λa 160 | 161 | ;; Pairs 162 | (-->λa (S (in-hole E (pair v_1 v_2))) 163 | (S_1 (in-hole E l)) 164 | (where S_1 (store-set S l (pair v_1 v_2))) 165 | (fresh l)) 166 | (-->λa (S (in-hole E (first l))) 167 | (S (in-hole E v_1)) 168 | (where (pair v_1 v_2) (store-ref S l))) 169 | (-->λa (S (in-hole E (first v))) 170 | (S (error)) 171 | (side-condition (pair-error? (term v)))) 172 | (-->λa (S (in-hole E (second l))) 173 | (S (in-hole E v_2)) 174 | (where (pair v_1 v_2) (store-ref S l))) 175 | (-->λa (S (in-hole E (second v))) 176 | (S (error)) 177 | (side-condition (pair-error? (term v)))) 178 | (-->λa (S (in-hole E (pair? l))) 179 | (S (in-hole E #t)) 180 | (where (pair v_1 v_2) (store-ref S l))) 181 | (-->λa (S (in-hole E (pair? v))) 182 | (S (in-hole E #f)) 183 | (side-condition (pair-error? (term v)))))) 184 | 185 | (define λa->arith 186 | (reduction-relation 187 | λaL-eval 188 | #:domain (S e) 189 | #:codomain (S e) 190 | #:arrow -->λa 191 | 192 | ;; Arith 193 | (-->λa (S (in-hole E (arith-op fixnum_1 fixnum_2))) 194 | (S (in-hole E v)) 195 | (where v (denote arith-op fixnum_1 fixnum_2))) 196 | 197 | (-->λa (S (in-hole E (arith-op v_1 v_2))) 198 | (S (error)) 199 | (side-condition (term (non-fixnum? v_1)))) 200 | (-->λa (S (in-hole E (arith-op v_1 v_2))) 201 | (S (error)) 202 | (side-condition (term (non-fixnum? v_1)))) 203 | (-->λa (S (in-hole E (fixnum? fixnum_1))) 204 | (S (in-hole E #t))) 205 | (-->λa (S (in-hole E (fixnum? v))) 206 | (S (in-hole E #f)) 207 | (side-condition (term (non-fixnum? v)))))) 208 | 209 | (define λa->eq 210 | (reduction-relation 211 | λaL-eval 212 | #:domain (S e) 213 | #:codomain (S e) 214 | #:arrow -->λa 215 | 216 | ;; Eq 217 | (-->λa (S (in-hole E (eq? v v))) 218 | (S (in-hole E #t))) 219 | (-->λa (S (in-hole E (eq? v_!_1 v_!_1))) 220 | (S (in-hole E #f))))) 221 | 222 | (define λa-> 223 | (union-reduction-relations 224 | λa->composition 225 | λa->admin 226 | λa->bools 227 | λa->boxes 228 | λa->pairs 229 | λa->arith 230 | λa->eq)) 231 | -------------------------------------------------------------------------------- /lambda-b.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | (require 3 | redex/reduction-semantics) 4 | 5 | (provide (all-defined-out)) 6 | 7 | ; λbL is the bit-specified λ-calculus. 8 | (define-language λbL 9 | [p ::= (with-labels ([x (λ (x ...) e)] ...) 10 | p) 11 | e] 12 | [e ::= (call e e ...) 13 | x 14 | (begin e ... e) 15 | (alloc e tag) 16 | (immediate number tag) ; TODO restrict range 17 | (word number) 18 | (mref e e) 19 | (mset! e e e) 20 | (+ e e) (- e e) (* e e) 21 | (let ([x e] ...) e) 22 | (compare (flag e e) e e) 23 | (compare (tag-eq? e tag) e e) 24 | ] 25 | [flag ::= eq? <] 26 | ; TODO: Can't use 'bool? 27 | [tag ::= 'bool 'pair 'box 'void 'empty 'fixnum 'procedure] 28 | [x ::= variable-not-otherwise-mentioned] 29 | #:binding-forms 30 | (λ (x ...) e #:refers-to (shadow x ...)) 31 | 32 | (with-labels ([x any_1] ...) #:refers-to (shadow x ...) 33 | e_2 #:refers-to (shadow x ...)) 34 | 35 | (let ([x e_1] ...) e_2 #:refers-to (shadow x ...))) 36 | 37 | (define-term b-eg 38 | (with-labels 39 | ((factL 40 | (λ (c n) 41 | (let ((fact (mref c (immediate 0 'fixnum)))) 42 | (compare 43 | (eq? 44 | (compare 45 | (eq? n (immediate 0 'fixnum)) 46 | (immediate 1 'bool) 47 | (immediate 0 'bool)) 48 | (immediate 0 'bool)) 49 | (* 50 | n 51 | (call 52 | (mref fact (word 0)) 53 | fact 54 | (- n (immediate 1 'fixnum)))) 55 | (immediate 1 'fixnum))))) 56 | (even?L 57 | (λ (c n) 58 | (let ((odd? (mref c (immediate 0 'fixnum)))) 59 | (compare 60 | (eq? 61 | (compare 62 | (eq? n (immediate 0 'fixnum)) 63 | (immediate 1 'bool) 64 | (immediate 0 'bool)) 65 | (immediate 0 'bool)) 66 | (call 67 | (mref odd? (word 0)) 68 | odd? 69 | (- n (immediate 1 'fixnum))) 70 | (immediate 1 'bool))))) 71 | (odd?L 72 | (λ (c n) 73 | (let ((even? (mref c (immediate 0 'fixnum)))) 74 | (compare 75 | (eq? 76 | (compare 77 | (eq? n (immediate 0 'fixnum)) 78 | (immediate 1 'bool) 79 | (immediate 0 'bool)) 80 | (immediate 0 'bool)) 81 | (call 82 | (mref even? (word 0)) 83 | even? 84 | (- n (immediate 1 'fixnum))) 85 | (immediate 0 'bool))))) 86 | (lengthL 87 | (λ (c x) 88 | (let ((length (mref c (immediate 0 'fixnum)))) 89 | (let ((s (word 1))) 90 | (let ((empty?«4487» (alloc s 'procedure))) 91 | (begin 92 | (begin 93 | (mset! empty?«4487» (word 0) empty?L)) 94 | (compare 95 | (eq? 96 | (compare 97 | (tag-eq? x 'pair) 98 | (immediate 1 'bool) 99 | (immediate 0 'bool)) 100 | (immediate 0 'bool)) 101 | (immediate -1 'fixnum) 102 | (compare 103 | (eq? 104 | (call 105 | (mref empty?«4487» (word 0)) 106 | empty?«4487» 107 | x) 108 | (immediate 0 'bool)) 109 | (+ 110 | (immediate 1 'fixnum) 111 | (call 112 | (mref length (word 0)) 113 | length 114 | (mref x (word 1)))) 115 | (immediate 0 'fixnum))))))))) 116 | (empty?L 117 | (λ (c x) 118 | (let () 119 | (compare 120 | (eq? x (immediate 0 'empty)) 121 | (immediate 1 'bool) 122 | (immediate 0 'bool)))))) 123 | (let ((x1 (alloc (word 1) 'box))) 124 | (begin 125 | (mset! x1 (word 0) (immediate 0 'fixnum)) 126 | (let ((x x1)) 127 | (let ((s1 (word 2))) 128 | (let ((fact«8489» (alloc s1 'procedure))) 129 | (begin 130 | (begin 131 | (mset! fact«8489» (word 0) factL) 132 | (mset! fact«8489» (word 1) fact«8489»)) 133 | (let ((s2 (word 2)) (s3 (word 2))) 134 | (let ((even?«8742» (alloc s2 'procedure)) 135 | (odd?«8743» (alloc s3 'procedure))) 136 | (begin 137 | (begin 138 | (mset! even?«8742» (word 0) even?L) 139 | (mset! even?«8742» (word 1) odd?«8743»)) 140 | (begin 141 | (mset! odd?«8743» (word 0) odd?L) 142 | (mset! odd?«8743» (word 1) even?«8742»)) 143 | (begin 144 | (compare 145 | (eq? 146 | (call 147 | (mref even?«8742» (word 0)) 148 | even?«8742» 149 | (call 150 | (mref fact«8489» (word 0)) 151 | fact«8489» 152 | (immediate 5 'fixnum))) 153 | (immediate 0 'bool)) 154 | (mset! 155 | x 156 | (word 0) 157 | (let ((x (alloc (word 2) 'pair))) 158 | (begin 159 | (mset! 160 | x 161 | (word 0) 162 | (immediate 2 'fixnum)) 163 | (mset! 164 | x 165 | (word 1) 166 | (immediate 0 'empty)) 167 | x))) 168 | (let ((s4 (word 2))) 169 | (let ((length«12165» 170 | (alloc s4 'procedure))) 171 | (begin 172 | (begin 173 | (mset! 174 | length«12165» 175 | (word 0) 176 | lengthL) 177 | (mset! 178 | length«12165» 179 | (word 1) 180 | length«12165»)) 181 | (mset! 182 | x 183 | (word 0) 184 | (call 185 | (mref length«12165» (word 0)) 186 | length«12165» 187 | (let ((x 188 | (alloc (word 2) 'pair))) 189 | (begin 190 | (mset! 191 | x 192 | (word 0) 193 | (immediate 1 'fixnum)) 194 | (mset! 195 | x 196 | (word 1) 197 | (immediate 0 'empty)) 198 | x)))))))) 199 | (mref x (word 0))))))))))))) 200 | ) 201 | -------------------------------------------------------------------------------- /source.scrbl: -------------------------------------------------------------------------------- 1 | #lang scribble/acmart @acmsmall @nonacm @screen 2 | @(require 3 | (only-in pict vc-append) 4 | (only-in scribble/manual deftech tech) 5 | "lambda-s.rkt" 6 | "defs.rkt" 7 | "bib.rkt") 8 | 9 | @(require (only-in redex/pict render-term/pretty-write) (only-in redex/reduction-semantics term)) 10 | @(define-syntax-rule (render-src-eg e) 11 | (nested #:style 'code-inset 12 | (para "Example:") 13 | (tabular #:row-properties '((top)) (list (list "> " (render-src e)))) 14 | (with-paper-rewriters (render-term/pretty-write λiL (term (eval/print-λiL e)))))) 15 | 16 | @title{Source Language: @source-lang} 17 | We start by briefly introducing the source language, @|source-lang|. 18 | The language includes a handful of standard features and is loosely inspired by 19 | Scheme. 20 | Each feature is choosen to make our compiler more @deftech{interesting}, meaning 21 | realistic compilation to an assembly language requires additional non-trivial 22 | compilation passes. 23 | 24 | @section{Syntax} 25 | @figure["fig:src-syntax" @elem{@|source-lang| Syntax} 26 | (render-language λiL #:nts '(e x tag-pred arith-op)) 27 | ] 28 | 29 | We present the syntax of @source-lang in @Figure-ref{fig:src-syntax} 30 | 31 | Mutually recursive multi-arity functions are introduced by @render-src[letrec]. 32 | For simplicity of presentation, we require functions are named; it is simple to 33 | translate from a language with anonymous functions. 34 | 35 | We include an error primitive, @render-src[error], which simply raises an 36 | uncatachable error with no associated information. 37 | It would make our compiler more @tech{interesting} to include a catchable error, 38 | but maybe too interesting for our purposes. 39 | Adding associated information to the error does not make the compiler more 40 | @tech{interesting}. 41 | 42 | Mutable references are introduced by @render-src[box], updated with 43 | @render-src[set-box!], and dereferenced by @render-src[unbox]. 44 | Purely to support imperative features, we include @render-src[(void)] and the 45 | @render-src[begin] form. 46 | @render-src[begin] allows executing a sequence of imperative expressions 47 | without @render-src[let]-binding their unimportant result, and 48 | @render-src[(void)] represents the unit value, and is implicitly returned by an 49 | imperative primitive. 50 | 51 | Immutable pairs are introduced with @render-src[pair] and @render-src['()] (the 52 | empty pair), and destructed with @render-src[first] and @render-src[second]. 53 | We depart from the usual Scheme names, @tt{cons}, @tt{car}, and @tt{cdr}, that 54 | serve only to evoke past mistakes and confuse the uninitiated. 55 | Pairs serve to represent arbitrary sized, structured, non-immediate data that 56 | must be heap allocated by the compiler. 57 | Mutable references already force us to deal with allocation, but are 58 | insufficient to represent @tech{interesting} data structures. 59 | 60 | The language supports literal fixed-sized integers, @render-src[fixnum]s, and a 61 | few arithmetic operations, @render-src[arith-ops]: addition, subtraction, 62 | mulitplication, and division. 63 | In practice, @render-src[fixnum]s are less than the machine word-size due to 64 | object tagging, but this is not important for our model. 65 | We do not specify their range, but consider the language parameterized by 66 | some @render-src[fixnum] range. 67 | 68 | The language includes booleans literals @render-src[#t] for true and 69 | @render-src[#f] for false, eliminated by @render-src[if]. 70 | Booleans introduce a second immediate data type, and branching introduces minor 71 | but non-trivial complications in some passes. 72 | Both are useful for making the model compiler more @tech{interesting}. 73 | We also include two predicates, @render-src[<] for comparing 74 | @render-src[fixnum]s, and @render-src[eq?] for comparing two values for 75 | identity (pointer equality rather than structural equality). 76 | 77 | Finally, we add predicates for checking the tag on each of our data types. 78 | This forces our compiler to model object tagging, a detail often ignored in 79 | models, and definitely @tech{interesting}. 80 | 81 | The binding forms, @render-src[letrec], @render-src[let], and @render-src[λ], 82 | support multi-arity bindings that are assumed to be disjoint. 83 | 84 | @section{Static Semantics} 85 | We could add an ML-style type system if we wanted to study compilation with 86 | types, but that is not the focus of the present work, so we do not. 87 | 88 | All programs must be well bound, implementing The Scheme Type system. 89 | This is completely standard and we omit it for brevity. 90 | 91 | @section{Dynamic Semantics} 92 | @figure["fig:src-red-comp" @elem{@|source-lang| Reduction (excerpts)} 93 | (vc-append 94 | 25 95 | (render-language λiL-eval #:nts '(E S v fv hv)) 96 | (render-reduction-relation 97 | (union-reduction-relations 98 | λi->composition 99 | λi->arith 100 | λi->pairs 101 | λi->eq) #:style 'horizontal)) 102 | ] 103 | 104 | We present the reduction system using evaluation contexts@~cite{felleisen1992}. 105 | The language has completely standard left-to-right call-by-value operational 106 | semantics, specified in the evaluation context. 107 | For brevity, we abstract all primitive operators using @render-term[λiL primop]. 108 | We use a store to model @render-src[letrec], in addition to mutable references, 109 | and pairs. 110 | The store @render-term[λiL-eval S] maps an abstract label to heap values 111 | @render-term[λiL-eval S]. 112 | We define values to be the base values, plus labels. 113 | We allow names to be values for technical reasons discussed later, although they 114 | should never appear in evaluation position for whole programs. 115 | Heap values include all values, functions, pairs, and mutable boxes. 116 | 117 | Note that we do not consider funtions or pairs values in the usual sense. 118 | Instead, they act as effectful operators that perform allocation. 119 | This is more faithful to the how they are implemented in the compiler, and 120 | simplifies the implementation of various semantics, such as the @render-term[λiL 121 | eq?] operator. 122 | 123 | The reduction rules are standard, so we give a selection of rules. 124 | @render-term[λiL-eval letrec] allocates the entire mutually recursive block of 125 | functions at once, resolving all names in the block to their labels. 126 | Pair allocates a fresh label, while the first and second projections dereference 127 | the label from the store. 128 | 129 | All operations are dynamically checked to ensure type safety. 130 | 131 | @section{Examples} 132 | The language allows us to implement favorite example programs from the compilers 133 | literature, such as factorial. 134 | 135 | @(render-prefix-and-finish λiL-eval λi-> (λs->-arrow) 3 136 | (() 137 | (letrec ([fact (λ (n) 138 | (if (eq? n 0) 139 | 1 140 | (* n (fact (- n 1)))))]) 141 | (fact 5)))) 142 | 143 | Mutable references let us implement the standard example of two 144 | hopefully-observationally-equivalent counters that use local state. 145 | The below example is rendered with an implementation of a printer for the 146 | language which prints pairs properly. 147 | 148 | @render-src-eg[ 149 | (let ([counter (let ([b (box 0)]) 150 | (letrec ([counter-proc 151 | (λ () 152 | (begin 153 | (set-box! b (+ 1 (unbox b))) 154 | (unbox b)))]) 155 | counter-proc))] 156 | [slow-counter (let ([b (box 0)]) 157 | (letrec ([slow-counter-proc 158 | (λ () 159 | (begin 160 | (set-box! b (+ 2 (unbox b))) 161 | (/ (unbox b) 2)))]) 162 | slow-counter-proc))]) 163 | (pair (counter) 164 | (pair (counter) 165 | (pair (slow-counter) (slow-counter))))) 166 | ] 167 | 168 | @;We can also express weird programs that appear in real languages but rarely in 169 | @;papers, like using intensional polymorphism to define @render-src[++] as the 170 | @;commutative operation on all our data types: 171 | @; 172 | @;@render-src-eg[ 173 | @;(letrec ([++ (λ (d1 d2) 174 | @; (if (fixnum? d1) 175 | @; (+ d1 d2) 176 | @; (if (boolean? d1) 177 | @; (or d1 d2) 178 | @; (if (list? d1) 179 | @; (append d1 d2) 180 | @; (void)))))] 181 | @; [or (λ (b1 b2) (if b1 b1 b2))] 182 | @; [and (λ (b1 b2) (if b1 b2 false))] 183 | @; [list? (λ (l) 184 | @; (if (eq? empty l) 185 | @; true 186 | @; (and (pair? l) (list? (second l)))))] 187 | @; [append (λ (l1 l2) 188 | @; (if (eq? l1 empty) 189 | @; l2 190 | @; (pair (first l1) (append (second l1) l2))))]) 191 | @; (pair (++ 5 6) 192 | @; (pair 193 | @; (++ true false) 194 | @; (++ (pair 2 empty) (pair 1 empty))))) 195 | @;] 196 | -------------------------------------------------------------------------------- /lambda-s.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | (require 3 | "base.rkt" 4 | redex/reduction-semantics 5 | racket/list) 6 | 7 | (provide (all-defined-out)) 8 | 9 | ; λsL is the λ-calculus surface language. 10 | ; It's a Scheme-like language, but could be ML like if given a type system 11 | (define-language λsL 12 | [e ::= (λ (x ...) e) (e e ...) x 13 | (begin e ... e) (box e) (unbox e) 14 | (set-box! e) (pair e e) (car e) (cdr e) fixnum (+ e e) (* e e) 15 | (let ([x e] ...) e) (letrec ([x e] ...) e) 16 | (void) '() 17 | (if e e e) 18 | (eq? e e) 19 | (pair? e) 20 | (fixnum? e) 21 | (boolean? e) 22 | (procedure? e) 23 | (box? e) 24 | (void? e) 25 | (< e e) 26 | boolean] 27 | [x ::= variable-not-otherwise-mentioned] 28 | [fixnum ::= integer] ; TODO restrict to fixnum range 29 | #:binding-forms 30 | (λ (x ...) e #:refers-to (shadow x ...)) 31 | (letrec ([x e_1] ...) #:refers-to (shadow x ...) 32 | e_2 #:refers-to (shadow x ...)) 33 | (let ([x e_1] ...) e_2 #:refers-to (shadow x ...))) 34 | 35 | ; λiL is the λ-calculus internal language. 36 | ;; NOTE: whitespace sensitive for typesetting in the paper. 37 | (define-extended-language λiL baseL 38 | [e ::= (letrec ([x (λ (x ...) e)] ...) e) (e e ...) x 39 | (let ([x e] ...) e) (begin e ... e) (void) 40 | (box e) (unbox e) (set-box! e e) 41 | '() (pair e e) (first e) (second e) 42 | fixnum (binop e e) 43 | #t #f (if e e e) 44 | (tag-pred e) (error)] 45 | [x ::= variable-not-otherwise-mentioned] 46 | 47 | ;; for display 48 | [Γ ::= ∅] 49 | [τ ::= ∅] 50 | #:binding-forms 51 | (λ (x ...) e #:refers-to (shadow x ...)) 52 | (letrec ([x any] ...) #:refers-to (shadow x ...) 53 | e #:refers-to (shadow x ...)) 54 | (let ([x e_1] ...) e_2 #:refers-to (shadow x ...))) 55 | 56 | (define-extended-language λiL-eval λiL 57 | ;[S ::= env] ; must be a dict of labels to values 58 | [E ::= hole (let ([x v] ... [x E] [x e] ...) e) (begin v ... E e ...) 59 | (if E e e) (primop v ... E e ...) (v ... E e ...)] 60 | [S ::= ((l hv) ...)] 61 | [e ::= .... l] 62 | [v ::= fixnum boolean '() (void) l x] 63 | [fv ::= (λ (x ...) e)] 64 | [hv ::= v fv (pair v v) (box v)]) 65 | 66 | (define-metafunction λiL-eval 67 | store-extend : S (l hv) ... -> S 68 | [(store-extend S (l hv) ...) 69 | (env-extend S (l (hv)) ...)]) 70 | 71 | (define-metafunction λiL-eval 72 | store-ref : S l -> hv 73 | [(store-ref S l) ,(car (term (env-ref S l)))]) 74 | 75 | (define (box-error? S v) 76 | (or (not (redex-match? λiL-eval l v)) 77 | (not (redex-match? λiL-eval (box v) 78 | (term (store-ref ,S ,v)))))) 79 | 80 | (define (pair-error? v) 81 | (not (redex-match? λiL-eval (pair e_1 e_2) v))) 82 | 83 | (define-metafunction λiL-eval 84 | [(non-pair? any) 85 | ,(pair-error? (term any))]) 86 | 87 | ;; NOTE: These are split-up to make type setting easier. 88 | (define λi->composition 89 | (reduction-relation 90 | λiL-eval 91 | #:domain (S e) 92 | #:codomain (S e) 93 | #:arrow -->λs 94 | 95 | (-->λs (S (in-hole E (let ([x v] ...) e))) 96 | (S (in-hole E (subst-all e (x ...) (v ...))))) 97 | 98 | (-->λs (S_1 (in-hole E (letrec ([x fv] ...) e))) 99 | (S_2 (in-hole E (subst-all e (x ...) (l ...)))) 100 | 101 | (where (l ...) (fresh-labels x ...)) 102 | (where (fv_1 ...) ((subst-all fv (x ...) (l ...)) ...)) 103 | (where S_2 (store-extend S_1 (l fv_1) ...))) 104 | 105 | (-->λs (S (in-hole E (begin v ... e))) 106 | (S (in-hole E e))))) 107 | 108 | (define λi->proc 109 | (reduction-relation 110 | λiL-eval 111 | #:domain (S e) 112 | #:codomain (S e) 113 | #:arrow -->λs 114 | 115 | (-->λs (S (in-hole E (l v ..._1))) 116 | (S (in-hole E (subst-all e (x ...) (v ...)))) 117 | (where (λ (x ..._1) e) (store-ref S l))) 118 | 119 | (-->λs (S (in-hole E (l v ...))) 120 | (S (error)) 121 | (where (λ (x ...) e) (store-ref S l)) 122 | (side-condition (term (arity-error (x ...) (v ...))))) 123 | 124 | (-->λs (S (in-hole E (l v ...))) 125 | (S (error)) 126 | (where hv (store-ref S l)) 127 | (side-condition (term (non-fv? hv)))))) 128 | 129 | (define λi->bools 130 | (reduction-relation 131 | λiL-eval 132 | #:domain (S e) 133 | #:codomain (S e) 134 | #:arrow -->λs 135 | 136 | ;; Booleans 137 | (-->λs (S (in-hole E (if #f e_1 e_2))) 138 | (S (in-hole E e_2))) 139 | (-->λs (S (in-hole E (if v e_1 e_2))) 140 | (S (in-hole E e_1)) 141 | (side-condition (term (non-false? v)))) 142 | 143 | (-->λs (S (in-hole E (boolean? #t))) 144 | (S (in-hole E #t))) 145 | (-->λs (S (in-hole E (boolean? #f))) 146 | (S (in-hole E #t))) 147 | (-->λs (S (in-hole E (boolean? v))) 148 | (S (in-hole E #f)) 149 | (side-condition (term (non-boolean? v)))))) 150 | 151 | (define λi->boxes 152 | (reduction-relation 153 | λiL-eval 154 | #:domain (S e) 155 | #:codomain (S e) 156 | #:arrow -->λs 157 | 158 | ;; Boxes 159 | (-->λs (S (in-hole E (box v))) 160 | (S_1 (in-hole E lb)) 161 | (fresh lb) 162 | (where S_1 (store-extend S (lb (box v))))) 163 | (-->λs (S (in-hole E (unbox l))) 164 | (S (in-hole E v)) 165 | (where (box v) (store-ref S l))) 166 | (-->λs (S (in-hole E (unbox v))) 167 | (S (error)) 168 | (side-condition (box-error? (term S) (term v)))) 169 | (-->λs (S_1 (in-hole E (set-box! l v))) 170 | (S_2 (in-hole E (void))) 171 | (where S_2 (store-extend S_1 (l (box v))))) 172 | (-->λs (S (in-hole E (box? l))) 173 | (S (in-hole E #t)) 174 | (where (box v) (store-ref S l))) 175 | (-->λs (S (in-hole E (box? v))) 176 | (S (in-hole E #f)) 177 | (side-condition (box-error? (term S) (term v)))))) 178 | 179 | (define λi->pairs 180 | (reduction-relation 181 | λiL-eval 182 | #:domain (S e) 183 | #:codomain (S e) 184 | #:arrow -->λs 185 | 186 | ;; Pairs 187 | (-->λs (S (in-hole E (pair v_1 v_2))) 188 | (S_1 (in-hole E lb)) 189 | (fresh lb) 190 | (where S_1 (store-extend S (lb (pair v_1 v_2))))) 191 | (-->λs (S (in-hole E (first l))) 192 | (S (in-hole E v_1)) 193 | (where (pair v_1 v_2) (store-ref S l))) 194 | (-->λs (S (in-hole E (second l))) 195 | (S (in-hole E v_2)) 196 | (where (pair v_1 v_2) (store-ref S l))) 197 | (-->λs (S (in-hole E (pair? (pair v_1 v_2)))) 198 | (S (in-hole E #t))) 199 | (-->λs (S (in-hole E (pair? v))) 200 | (S (in-hole E #f)) 201 | (side-condition (term (non-pair? v)))) 202 | (-->λs (S (in-hole E (second v))) 203 | (S (error)) 204 | (side-condition (term (non-pair? v)))) 205 | (-->λs (S (in-hole E (first v))) 206 | (S (error)) 207 | (side-condition (term (non-pair? v)))) 208 | )) 209 | 210 | (define λi->error 211 | (reduction-relation 212 | λiL-eval 213 | #:domain (S e) 214 | #:codomain (S e) 215 | 216 | #:arrow -->λs 217 | (-->λs (S (in-hole E (error))) (S (error))))) 218 | 219 | (define λi->arith 220 | (reduction-relation 221 | λiL-eval 222 | #:domain (S e) 223 | #:codomain (S e) 224 | #:arrow -->λs 225 | 226 | ;; Arith 227 | (-->λs (S (in-hole E (arith-op fixnum_1 fixnum_2))) 228 | (S (in-hole E v)) 229 | (where v (denote arith-op fixnum_1 fixnum_2))) 230 | 231 | (-->λs (S (in-hole E (arith-op v_1 v_2))) 232 | (S (error)) 233 | (side-condition (term (non-fixnum? v_1)))) 234 | (-->λs (S (in-hole E (arith-op v_1 v_2))) 235 | (S (error)) 236 | (side-condition (term (non-fixnum? v_1)))) 237 | (-->λs (S (in-hole E (fixnum? fixnum_1))) 238 | (S (in-hole E #t))) 239 | (-->λs (S (in-hole E (fixnum? v))) 240 | (S (in-hole E #f)) 241 | (side-condition (term (non-fixnum? v)))))) 242 | 243 | (define λi->eq 244 | (reduction-relation 245 | λiL-eval 246 | #:domain (S e) 247 | #:codomain (S e) 248 | #:arrow -->λs 249 | 250 | ;; Eq 251 | (-->λs (S (in-hole E (eq? v v))) 252 | (S (in-hole E #t))) 253 | (-->λs (S (in-hole E (eq? v_1 v_2))) 254 | (S (in-hole E #f)) 255 | (side-condition (term (not-equal? v_1 v_2)))))) 256 | 257 | (define λi-> 258 | (union-reduction-relations 259 | λi->composition 260 | λi->proc 261 | λi->bools 262 | λi->boxes 263 | λi->pairs 264 | λi->arith 265 | λi->eq 266 | λi->error)) 267 | 268 | (define-metafunction λiL-eval 269 | print-λiL : S hv -> e 270 | [(print-λiL S l) 271 | (print-λiL S (store-ref S l))] 272 | [(print-λiL S (pair v_1 v_2)) 273 | (pair (print-λiL S v_1) (print-λiL S v_2))] 274 | [(print-λiL S (λ (x ...) e)) 275 | '] 276 | [(print-λiL S v) 277 | v]) 278 | 279 | (define-metafunction λiL-eval 280 | eval-λiL : e -> v 281 | [(eval-λiL e) 282 | ,(second (car (apply-reduction-relation* λi-> (term (() e)))))]) 283 | 284 | (define-metafunction λiL-eval 285 | eval/print-λiL : e -> e 286 | [(eval/print-λiL e) 287 | ,(let ([x (car (apply-reduction-relation* λi-> (term (() e))))]) 288 | (term (print-λiL ,(car x) ,(cadr x))))]) 289 | 290 | (define-term s-eg 291 | (let ([x (box 0)]) 292 | (letrec ([fact (λ (n) 293 | (if (eq? n 0) 294 | 1 295 | (* n (fact (- n 1)))))]) 296 | (letrec ([even? (λ (n) 297 | (if (eq? n 0) 298 | #t 299 | (odd? (- n 1))))] 300 | [odd? (λ (n) 301 | (if (eq? n 0) 302 | #f 303 | (even? (- n 1))))]) 304 | (begin 305 | (if (even? 1) 306 | (letrec ([length (λ (x) 307 | (letrec ([empty? (λ (x) (eq? x '()))]) 308 | (if (pair? x) 309 | (if (empty? x) 310 | 0 311 | (+ 1 (length (cdr x)))) 312 | (error))))]) 313 | (set-box! x (length (pair 1 '())))) 314 | (set-box! x (pair (fact 5) '()))) 315 | (unbox x)))))) 316 | 317 | (test-match λiL e (term s-eg)) 318 | 319 | (test-->> λi-> #:equiv (lambda (x y) 320 | (alpha-equivalent? λiL (second x) y)) 321 | (term (() (letrec ([fact (λ (n) 322 | (if (eq? n 0) 323 | 1 324 | (* n (fact (- n 1)))))]) 325 | (fact 5)))) 326 | (term 120)) 327 | 328 | 329 | (test-->> λi-> 330 | #:equiv (lambda (x y) 331 | (alpha-equivalent? λiL (second x) y)) 332 | (term 333 | (() 334 | (letrec ([even? (λ (n) 335 | (if (eq? n 0) 336 | #t 337 | (odd? (- n 1))))] 338 | [odd? (λ (n) 339 | (if (eq? n 0) 340 | #f 341 | (even? (- n 1))))] 342 | [and (λ (n m) (if n m #f))] 343 | [not (λ (n) (if n #f #t))]) 344 | (and 345 | (not (even? 5)) 346 | (and 347 | (even? 4) 348 | (even? 0)))))) 349 | (term #t)) 350 | 351 | (test-->> λi-> 352 | #:equiv (lambda (x y) 353 | (alpha-equivalent? λiL-eval (term (print-λiL ,(car x) ,(cadr x))) y)) 354 | (term 355 | (() 356 | (let ([x (box 5)]) 357 | (pair (unbox x) 358 | (pair 359 | (begin 360 | (set-box! x 6) 361 | (unbox x)) 362 | '()))))) 363 | (term (pair 5 (pair 6 '())))) 364 | 365 | (test-->> λi-> #:equiv (lambda (x y) 366 | (alpha-equivalent? λiL-eval (term (print-λiL ,(car x) ,(cadr x))) y)) 367 | (term (() s-eg)) (term (pair 120 '()))) 368 | -------------------------------------------------------------------------------- /specify.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | "lambda-h.rkt" 6 | "lambda-b.rkt") 7 | 8 | (provide (all-defined-out)) 9 | 10 | ; Design pattern for a multi-language with syntactic distinction between source 11 | ; and target, but also a combined syntax. 12 | (define-union-language tagSL (S. λhL) (T. λbL)) 13 | (define-union-language mergeSL λhL λbL) 14 | (define-union-language preSL mergeSL tagSL) 15 | 16 | (define-extended-language SL preSL 17 | [T ::= hole 18 | (with-labels ([T.x (λ (T.x ...) T.e)] 19 | ... 20 | [T.x (λ (T.x ...) T)] 21 | [x (λ (x ...) e)] 22 | ...) 23 | p) 24 | (with-labels ([T.x (λ (T.x ...) T.e)] 25 | ...) 26 | T) 27 | (kw T.e ... T e ...) 28 | (let ([T.x_1 T.e] 29 | ... 30 | [x_i T] 31 | [x_n e] ...) 32 | e) 33 | (let ([T.x_1 T.e] ...) 34 | T) 35 | (alloc T tag) 36 | (compare (tag-eq? T tag) e ...) 37 | (compare (tag-eq? T.e tag) T.e ... T e ...) 38 | (compare (flag T.e ... T e ...) e ...) 39 | (compare (flag T.e ...) T.e ... T e ...)] 40 | [kw ::= begin + - * mref mset! call] 41 | 42 | ; Spec, but slow 43 | #;[C ::= (compatible-closure-context e) (compatible-closure-context p #:wrt e)] 44 | [C ::= T]) 45 | 46 | (define s-> 47 | (reduction-relation 48 | SL 49 | #:domain p 50 | #:codomain p 51 | 52 | (--> (cletrec ([x (closure e ...)] ...) 53 | e_2) 54 | (let ([s (word e_size)] ...) 55 | (let ([x (alloc s 'procedure)] ...) 56 | (begin 57 | (begin e_set ...) 58 | ... 59 | e_2))) 60 | (fresh ((s ...) (x ...))) 61 | (where (e_size ...) ,(map length (term ((e ...) ...)))) 62 | (where ((e_index ...) ...) 63 | ,(for/list ([size (term (e_size ...))]) 64 | (build-list size values))) 65 | (where ((e_set ...) ...) 66 | ,(for/list ([x (term (x ...))] 67 | [ils (term ((e_index ...) ...))] 68 | [els (term ((e ...) ...))]) 69 | (for/list ([index ils] 70 | [e els]) 71 | (term (mset! ,x (word ,index) ,e)))))) 72 | (--> (closure-ref e_b e_i) (mref e_b e_i)) 73 | (--> (apply-closure e e_a ...) 74 | (let ([x (mref e (word 0))]) 75 | (call x e_a ...)) 76 | (fresh x)) 77 | (--> (procedure? e) 78 | (compare (tag-eq? e 'procedure) #t #f)) 79 | 80 | (--> 81 | (box e) 82 | (let ([x (alloc (word 1) 'box)]) 83 | (begin 84 | (mset! x (word 0) e) 85 | x)) 86 | (fresh x)) 87 | (--> 88 | (set-box! e_1 e_2) 89 | (mset! e_1 (word 0) e_2)) 90 | (--> 91 | (unbox e) 92 | (mref e (word 0))) 93 | (--> 94 | (box? e) 95 | (compare (tag-eq? e 'box) #t #f)) 96 | 97 | (--> #f (immediate 0 'bool)) 98 | (--> #t (immediate 1 'bool)) 99 | (--> (if e_1 e_2 e_3) 100 | (compare (eq? e_1 #f) e_3 e_2)) 101 | (--> (eq? e_1 e_2) 102 | (compare (eq? e_1 e_2) #t #f)) 103 | (--> (boolean? e) 104 | (compare (tag-eq? e 'bool) #t #f)) 105 | 106 | (--> number (immediate number 'fixnum)) 107 | (--> (fixnum? e) 108 | (compare (tag-eq? e 'fixnum) #t #f)) 109 | 110 | (--> (void) (immediate 0 'void)) 111 | (--> (void? e) 112 | (compare (tag-eq? e 'void) #t #f)) 113 | 114 | (--> '() (immediate 0 'empty)) 115 | (--> (empty? e) 116 | (compare (tag-eq? e 'empty) #t #f)) 117 | 118 | (--> (cons e_1 e_2) 119 | (let ([x (alloc (word 2) 'pair)]) 120 | (begin 121 | (mset! x (word 0) e_1) 122 | (mset! x (word 1) e_2) 123 | x))) 124 | (--> (car e_1) (mref e_1 (word 0))) 125 | (--> (cdr e_1) (mref e_1 (word 1))) 126 | (--> (pair? e) 127 | (compare (tag-eq? e 'pair) #t #f)) 128 | 129 | 130 | ;; Commuting conversions 131 | ;; Since we're in ANF, we can either go the F-to-TAL route and explicitly 132 | ;; manage declarations.. or just have separate commuting conversions, 133 | ;; which are easier to specify, and let normalization take care of it. 134 | (--> (let ([x_1 (let ([x_2 e_2]) e_1)]) 135 | e) 136 | (let ([x_2 e_2]) 137 | (let ([x_1 e_1]) 138 | e))) 139 | 140 | (--> (let ([x_1 (begin e_s ... e_1)]) 141 | e) 142 | 143 | (begin e_s ... 144 | (let ([x_1 e_1]) 145 | e))))) 146 | 147 | #| 148 | (compare (flag e_1 e_2) e_t e_f) 149 | --> 150 | (begin (jmp-if flag l_t) (jmp l_f) (with-label l_t e_t) (with-label l_f e_f)) 151 | 152 | 153 | (in-hole nontail (call l e ...)) 154 | --> 155 | (begin (push e) ... (push return) (jump l) (with-label return (pop x)) (in-hole nontail x)) 156 | |# 157 | (define s->+ (context-closure s-> SL C)) 158 | 159 | (current-cache-all? #t) 160 | 161 | (module+ test 162 | (parameterize ([default-language SL]) 163 | (test-->> 164 | s->+ 165 | #:equiv alpha-equivalent? 166 | (term h-eg) 167 | (term 168 | (with-labels 169 | ((factL 170 | (λ (c n) 171 | (let ((fact (mref c (immediate 0 'fixnum)))) 172 | (compare 173 | (eq? 174 | (compare 175 | (eq? n (immediate 0 'fixnum)) 176 | (immediate 1 'bool) 177 | (immediate 0 'bool)) 178 | (immediate 0 'bool)) 179 | (* 180 | n 181 | (call 182 | (mref fact (word 0)) 183 | fact 184 | (- n (immediate 1 'fixnum)))) 185 | (immediate 1 'fixnum))))) 186 | (even?L 187 | (λ (c n) 188 | (let ((odd? (mref c (immediate 0 'fixnum)))) 189 | (compare 190 | (eq? 191 | (compare 192 | (eq? n (immediate 0 'fixnum)) 193 | (immediate 1 'bool) 194 | (immediate 0 'bool)) 195 | (immediate 0 'bool)) 196 | (call 197 | (mref odd? (word 0)) 198 | odd? 199 | (- n (immediate 1 'fixnum))) 200 | (immediate 1 'bool))))) 201 | (odd?L 202 | (λ (c n) 203 | (let ((even? (mref c (immediate 0 'fixnum)))) 204 | (compare 205 | (eq? 206 | (compare 207 | (eq? n (immediate 0 'fixnum)) 208 | (immediate 1 'bool) 209 | (immediate 0 'bool)) 210 | (immediate 0 'bool)) 211 | (call 212 | (mref even? (word 0)) 213 | even? 214 | (- n (immediate 1 'fixnum))) 215 | (immediate 0 'bool))))) 216 | (lengthL 217 | (λ (c x) 218 | (let ((length (mref c (immediate 0 'fixnum)))) 219 | (let ((s (word 1))) 220 | (let ((empty?«4487» (alloc s 'procedure))) 221 | (begin 222 | (begin 223 | (mset! empty?«4487» (word 0) empty?L)) 224 | (compare 225 | (eq? 226 | (compare 227 | (tag-eq? x 'pair) 228 | (immediate 1 'bool) 229 | (immediate 0 'bool)) 230 | (immediate 0 'bool)) 231 | (immediate -1 'fixnum) 232 | (compare 233 | (eq? 234 | (call 235 | (mref empty?«4487» (word 0)) 236 | empty?«4487» 237 | x) 238 | (immediate 0 'bool)) 239 | (+ 240 | (immediate 1 'fixnum) 241 | (call 242 | (mref length (word 0)) 243 | length 244 | (mref x (word 1)))) 245 | (immediate 0 'fixnum))))))))) 246 | (empty?L 247 | (λ (c x) 248 | (let () 249 | (compare 250 | (eq? x (immediate 0 'empty)) 251 | (immediate 1 'bool) 252 | (immediate 0 'bool)))))) 253 | (let ((x1 (alloc (word 1) 'box))) 254 | (begin 255 | (mset! x1 (word 0) (immediate 0 'fixnum)) 256 | (let ((x x1)) 257 | (let ((s1 (word 2))) 258 | (let ((fact«8489» (alloc s1 'procedure))) 259 | (begin 260 | (begin 261 | (mset! fact«8489» (word 0) factL) 262 | (mset! fact«8489» (word 1) fact«8489»)) 263 | (let ((s2 (word 2)) (s3 (word 2))) 264 | (let ((even?«8742» (alloc s2 'procedure)) 265 | (odd?«8743» (alloc s3 'procedure))) 266 | (begin 267 | (begin 268 | (mset! even?«8742» (word 0) even?L) 269 | (mset! even?«8742» (word 1) odd?«8743»)) 270 | (begin 271 | (mset! odd?«8743» (word 0) odd?L) 272 | (mset! odd?«8743» (word 1) even?«8742»)) 273 | (begin 274 | (compare 275 | (eq? 276 | (call 277 | (mref even?«8742» (word 0)) 278 | even?«8742» 279 | (call 280 | (mref fact«8489» (word 0)) 281 | fact«8489» 282 | (immediate 5 'fixnum))) 283 | (immediate 0 'bool)) 284 | (mset! 285 | x 286 | (word 0) 287 | (let ((x (alloc (word 2) 'pair))) 288 | (begin 289 | (mset! 290 | x 291 | (word 0) 292 | (immediate 2 'fixnum)) 293 | (mset! 294 | x 295 | (word 1) 296 | (immediate 0 'empty)) 297 | x))) 298 | (let ((s4 (word 2))) 299 | (let ((length«12165» 300 | (alloc s4 'procedure))) 301 | (begin 302 | (begin 303 | (mset! 304 | length«12165» 305 | (word 0) 306 | lengthL) 307 | (mset! 308 | length«12165» 309 | (word 1) 310 | length«12165»)) 311 | (mset! 312 | x 313 | (word 0) 314 | (call 315 | (mref length«12165» (word 0)) 316 | length«12165» 317 | (let ((x 318 | (alloc (word 2) 'pair))) 319 | (begin 320 | (mset! 321 | x 322 | (word 0) 323 | (immediate 1 'fixnum)) 324 | (mset! 325 | x 326 | (word 1) 327 | (immediate 0 'empty)) 328 | x)))))))) 329 | (mref x (word 0))))))))))))))))) 330 | -------------------------------------------------------------------------------- /anf.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | redex/reduction-semantics 5 | racket/syntax 6 | "lambda-s.rkt" 7 | "lambda-a.rkt") 8 | 9 | (provide (all-defined-out)) 10 | 11 | (set-cache-size! 1000) 12 | (check-redundancy #t) 13 | (current-cache-all? #t) 14 | 15 | ; Design pattern for a multi-language with syntactic distinction between source 16 | ; and target, but also a combined syntax. 17 | (define-union-language tagANFL (S. λiL-eval) (A. λaL-eval)) 18 | (define-extended-language ANFL tagANFL 19 | ; NOTE: Hacks to get type setting to work 20 | [A.x ::= .... ] 21 | [S.x ::= .... ] 22 | 23 | ;; Multi-language 24 | [A.v ::= .... (AS S.e)] 25 | [A.n ::= .... (AS S.e)] 26 | [A.e ::= .... (AS S.e)] 27 | [S.e ::= .... (SA A.e)] 28 | [x ::= S.x A.x] 29 | [primop ::= A.primop S.primop] 30 | [e ::= S.e A.e] 31 | 32 | #;[T ::= (in-hole C (AS A.Cm))] 33 | [T ::= (in-hole C (AS hole)) 34 | (in-hole C (lambda (x ...) (AS hole)))] 35 | 36 | 37 | [C ::= A.Cv] 38 | 39 | [S ::= S.env] 40 | 41 | ; For display 42 | [H ::= S.env] 43 | [S.Γ ::= ∅] 44 | [A.Γ ::= ∅] 45 | [A.τ ::= ∅] 46 | [S.τ ::= ∅]) 47 | 48 | (define-metafunction ANFL 49 | [(non-Cn? any) 50 | ,(not (redex-match ANFL A.Cn (term any)))]) 51 | 52 | (define-metafunction ANFL 53 | [(non-Cm? any) 54 | ,(not (redex-match ANFL A.Cm (term any)))]) 55 | 56 | (define-metafunction ANFL 57 | [(non-Tv? any) 58 | ,(not (redex-match ANFL A.v (term any)))]) 59 | 60 | (define anf-> 61 | (reduction-relation 62 | ANFL 63 | ; #:domain S.e 64 | ; #:codomain S.e 65 | ;; One of these tests fail due to an apparent bug: 66 | ; reduction-relation: relation not defined for (letrec ((fact«711» (λ (n«615») (let ((x (eq? n«615» 0))) (if x 1 (let ((x1 (- n«615» 1))) (let ((x2 (fact«711» x1))) (* n«615» x2)))))))) (fact«711» 5)) 67 | ; > (redex-match? ANFL S.e '(letrec ((fact«711» (λ (n«615») (let ((x (eq? n«615» 0))) (if x 1 (let ((x1 (- n«615» 1))) (let ((x2 (fact«711» x1))) (* n«615» x2)))))))) (fact«711» 5))) 68 | ; #t 69 | #:arrow -->a 70 | 71 | (-->a A.e (SA A.e) "A-normal") 72 | 73 | (-->a 74 | (in-hole S.E (let ([A.x S.e] ...) S.e_2)) 75 | (SA (let ([A.x (AS S.e)] ...) (AS (in-hole S.E S.e_2)))) 76 | (side-condition (not (equal? (term hole) (term S.E)))) 77 | "A-merge-l") 78 | 79 | (-->a 80 | (in-hole S.E (begin S.e_r ... S.e)) 81 | (SA (begin (AS S.e_r) ... (AS (in-hole S.E S.e)))) 82 | (side-condition (not (equal? (term hole) (term S.E)))) 83 | "A-merge-b") 84 | 85 | (-->a 86 | (in-hole S.E (letrec ([A.x (λ any S.e_1)] ...) S.e)) 87 | (SA (letrec ([A.x (λ any (AS S.e_1))] ...) (AS (in-hole S.E S.e)))) 88 | "A-merge-r") 89 | 90 | (-->a 91 | (in-hole S.E (if A.v S.e_1 S.e_2)) 92 | (SA (letrec ([j (λ (x) (in-hole S.E x))]) 93 | (if A.v (AS (j S.e_1)) (AS (j S.e_2))))) 94 | (fresh j) 95 | (fresh x) 96 | (side-condition (term (non-Cn? S.E))) 97 | "A-join") 98 | 99 | (-->a 100 | (in-hole A.Cm (if A.v S.e_1 S.e_2)) 101 | (SA (if A.v (AS S.e_1) (AS S.e_2))) 102 | (side-condition (not (equal? (term hole) (term S.E)))) 103 | "A-merge-if") 104 | 105 | (-->a (in-hole S.E A.n) (SA (let ([x A.n]) (AS (in-hole S.E x)))) 106 | "A-lift" 107 | (fresh x) 108 | ; Optimizations 109 | ; TODO: This optimization can be enabled for "predicates"? 110 | #;(side-condition 111 | (not (redex-match? ANFL (in-hole E_1 (if hole e_1 e_2)) (term E)))) 112 | #;(side-condition 113 | (not (redex-match? ANFL (in-hole S.E_1 (begin A.n ... hole S.e ... S.e_2)) (term S.E)))) 114 | ; Termination conditions 115 | #;(where (S.E_!_1 S.E_!_1) (hole S.E)) 116 | #;(side-condition 117 | (not (redex-match? ANFL (in-hole S.E_1 (let ([A.x_1 A.n_1] ... [S.x_2 hole] [S.x_3 S.e_3] ...) S.e_2)) (term S.E)))) 118 | (side-condition 119 | (term (non-Cn? S.E))) 120 | (side-condition 121 | (term (non-Tv? A.n)))))) 122 | 123 | (define st-> 124 | (reduction-relation 125 | ANFL 126 | ;#:domain A.e 127 | ;#:codomain A.e 128 | #:arrow -->st 129 | 130 | (-->st (in-hole C (AS (SA A.e))) (in-hole C A.e) "Boundary-1") 131 | (-->st (in-hole C (SA (AS S.e))) (in-hole C S.e) "Boundary-2"))) 132 | 133 | (define anf->+ 134 | (union-reduction-relations 135 | (context-closure anf-> ANFL T) 136 | st->)) 137 | 138 | (define (maybe-apply-reduction-relation r e) 139 | (with-handlers ([values (lambda _ #f)]) 140 | (apply-reduction-relation r e))) 141 | 142 | (define-judgment-form ANFL 143 | #:mode (not-anf->+j I) 144 | 145 | [(where (#f) ,(maybe-apply-reduction-relation anf->+ (term e_1))) 146 | ------------------- 147 | (not-anf->+j e_1)]) 148 | 149 | (define-judgment-form ANFL 150 | #:mode (anf->j I O) 151 | 152 | [(where (e_p ... e e_r ...) ,(maybe-apply-reduction-relation anf-> (term e_1))) 153 | ------------------- 154 | (anf->j e_1 e)]) 155 | 156 | (define-judgment-form ANFL 157 | #:mode (st->j I O) 158 | 159 | [(where (e_p ... e e_r ...) ,(maybe-apply-reduction-relation st-> (term e_1))) 160 | ------------------- 161 | (st->j e_1 e)]) 162 | 163 | (define-judgment-form ANFL 164 | #:mode (anf->+j I O) 165 | 166 | [(anf->j e_1 e) 167 | ------------------- 168 | (anf->+j (in-hole T e_1) (in-hole T e))] 169 | 170 | [(st->j e_1 e) 171 | ---------------- 172 | (anf->+j e_1 e)]) 173 | 174 | (define-judgment-form ANFL 175 | #:mode (anf->*j I O) 176 | 177 | [(where (e) ,(apply-reduction-relation* anf->+ (term e_1))) 178 | ------------------- 179 | (anf->*j e_1 e)]) 180 | 181 | (define-judgment-form ANFL 182 | #:mode (anf-compile I O) 183 | 184 | [(anf->*j (AS S.e) A.e) (not-anf->+j A.e) 185 | --------------------- 186 | (anf-compile S.e A.e)]) 187 | 188 | (define-judgment-form ANFL 189 | #:mode (λi->j I O) 190 | 191 | [(where ((H_2 S.e_2)) ,(maybe-apply-reduction-relation λi-> (term (H_1 S.e_1)))) 192 | ------------------- 193 | (λi->j (H_1 S.e_1) (H_2 S.e_2))]) 194 | 195 | (define-judgment-form ANFL 196 | #:mode (λi->j* I O) 197 | 198 | [------------------- 199 | (λi->j* (H_1 S.e_1) (H_1 S.e_1))] 200 | 201 | [(λi->j (H_1 S.e_1) (H_2 S.e_2)) 202 | (λi->j* (H_1 S.e_1) (H_3 S.e_3)) 203 | ------------------- 204 | (λi->j* (H_1 S.e_1) (H_3 S.e_3))]) 205 | 206 | (define-judgment-form ANFL 207 | #:mode (λa->j I O) 208 | 209 | [(where ((H e)) ,(maybe-apply-reduction-relation λa-> (term any_1))) 210 | ------------------- 211 | (λa->j any_1 (H e))]) 212 | 213 | (define-judgment-form ANFL 214 | #:mode (λa->j* I O) 215 | 216 | [------------------- 217 | (λa->j* (H_1 S.e_1) (H_1 S.e_1))] 218 | 219 | [(λa->j (H_1 S.e_1) (H_2 S.e_2)) 220 | (λa->j* (H_1 S.e_1) (H_3 S.e_3)) 221 | ------------------- 222 | (λa->j* (H_1 S.e_1) (H_3 S.e_3))]) 223 | 224 | (define-judgment-form ANFL 225 | #:mode (anf-eval->+ I O) 226 | 227 | [(λi->j (H_1 S.e_1) (H_2 S.e_2)) 228 | ----------------------------- "S-Interp" 229 | (anf-eval->+ (H_1 S.e_1) (H_2 S.e_2))] 230 | 231 | [(λi->j (H_1 S.e_1) (H_2 S.e_2)) 232 | ----------------------------- "AS-Interp" 233 | (anf-eval->+ (H_1 (AS S.e_1)) (H_2 (AS S.e_2)))] 234 | 235 | [(λa->j (H_1 A.e_1) (H_2 A.e_2)) 236 | ----------------------------- "A-Run" 237 | (anf-eval->+ (H_1 A.e_1) (H_2 A.e_2))] 238 | 239 | [(λa->j (H_1 A.e_1) (H_2 A.e_2)) 240 | ----------------------------- "SA-Run" 241 | (anf-eval->+ (H_1 (SA A.e_1)) (H_2 (SA A.e_2)))] 242 | 243 | [(anf->+j A.e_1 A.e_2) 244 | ;; TODO: Need to be able to translate the heap. 245 | ----------------------------- "JIT" 246 | (anf-eval->+ (H_1 A.e_1) (H_1 A.e_2))]) 247 | 248 | (define-judgment-form ANFL 249 | #:mode (anf-eval->* I O) 250 | 251 | [(anf-eval->+ (H_1 e_1) (H_2 e_2)) 252 | (anf-eval->* (H_2 e_2) (H_3 e_3)) 253 | ----------------------------- 254 | (anf-eval->* (H_1 e_1) (H_3 e_3))] 255 | 256 | [----------------------------- 257 | (anf-eval->* (H e) (H e))]) 258 | 259 | (define-judgment-form ANFL 260 | #:mode (aot-normalize I O) 261 | #:contract (aot-normalize S.e A.e) 262 | 263 | [(where (A.e) ,(apply-reduction-relation* anf->+ (term (AS S.e)))) 264 | ----------------------- "AOT" 265 | (aot-normalize S.e A.e)]) 266 | 267 | (define-metafunction ANFL 268 | compile-anf : S.e -> A.e 269 | [(compile-anf S.e) 270 | A.e 271 | (where (A.e) ,(apply-reduction-relation* anf->+ (term (AS S.e))))]) 272 | 273 | (define (step n x) 274 | (if (zero? n) 275 | x 276 | (step (sub1 n) (car (apply-reduction-relation anf->+ x))))) 277 | 278 | (module+ test 279 | (parameterize ([default-language λaL]) 280 | (test-->> 281 | anf->+ 282 | #:equiv alpha-equivalent? 283 | (term 284 | (AS (letrec ([fact (λ (n) 285 | (if (eq? n 0) 286 | 1 287 | (* n (fact (- n 1)))))]) 288 | (fact 5)))) 289 | (term 290 | (letrec ([fact (λ (n) 291 | (let ([x (eq? n 0)]) 292 | (if x 293 | 1 294 | (let ([x (- n 1)]) 295 | (let ([x2 (fact x)]) 296 | (* n x2))))))]) 297 | (fact 5)))) 298 | 299 | (test-->> 300 | anf->+ 301 | #:equiv alpha-equivalent? 302 | (term (AS ((if ((x 5) 4) meow bark) 5 2))) 303 | 304 | (term 305 | (let ((x1 (x 5))) 306 | (let ((x3 (x1 4))) 307 | (letrec ((j (λ (x2) (x2 5 2)))) 308 | (if x3 309 | (j meow) 310 | (j bark))))))) 311 | (test-->> 312 | anf->+ 313 | #:equiv alpha-equivalent? 314 | (term (AS (+ (if (let ([x #t]) x) 6 7) 1))) 315 | 316 | (term 317 | (let ([x #t]) 318 | (letrec ([j (λ (x) (+ x 1))]) 319 | (if x (j 6) (j 7)))))) 320 | 321 | (test-judgment-holds 322 | (anf-eval->* (() (AS (+ (if (let ([x #t]) x) 6 7) 1))) 323 | (_ 7))) 324 | 325 | (test-judgment-holds 326 | (anf-eval->* (() (AS (+ (if (let ([x #t]) x) 6 7) 1))) 327 | (() (let ([x_2 #t]) 328 | (letrec ([x_3 (λ (x_1) (+ x_1 1))]) 329 | (if x_2 (x_3 6) (x_3 7))))))) 330 | 331 | #;(test-->> 332 | cc->+ 333 | #:equiv alpha-equivalent? 334 | (term s-eg) 335 | (term 336 | (let ((x (box 0))) 337 | (pletrec 338 | ((factL 339 | (λ (c n) 340 | (let ((fact (closure-ref c 0))) 341 | (if (eq? n 0) 342 | 1 343 | (* n (apply-closure fact fact (- n 1)))))))) 344 | (cletrec 345 | ((fact (closure factL fact))) 346 | (pletrec 347 | ((even?L 348 | (λ (c n) 349 | (let ((odd? (closure-ref c 0))) 350 | (if (eq? n 0) 351 | #t 352 | (apply-closure odd? odd? (- n 1)))))) 353 | (odd?L 354 | (λ (c n) 355 | (let ((even? (closure-ref c 0))) 356 | (if (eq? n 0) 357 | #f 358 | (apply-closure even? even? (- n 1))))))) 359 | (cletrec 360 | ((even? (closure even?L odd?)) 361 | (odd? (closure odd?L even?))) 362 | (begin 363 | (if (apply-closure 364 | even? 365 | even? 366 | (apply-closure fact fact 5)) 367 | (pletrec 368 | ((lengthL 369 | (λ (c x) 370 | (let ((length (closure-ref c 0))) 371 | (pletrec 372 | ((empty?L (λ (c x) (let () (eq? x '()))))) 373 | (cletrec 374 | ((empty? (closure empty?L))) 375 | (if (pair? x) 376 | (if (apply-closure empty? empty? x) 377 | 0 378 | (+ 379 | 1 380 | (apply-closure 381 | length 382 | length 383 | (cdr x)))) 384 | -1))))))) 385 | (cletrec 386 | ((length (closure lengthL length))) 387 | (set-box! 388 | x 389 | (apply-closure length length (cons 1 '()))))) 390 | (set-box! x (cons 2 '()))) 391 | (unbox x))))))))))) 392 | 393 | ;; TODO: Add reduction relations, do union-reduction-relation, and show simulation. 394 | -------------------------------------------------------------------------------- /sketch.md: -------------------------------------------------------------------------------- 1 | # Abstract 2 | System F to Typed Assembly Language is the standard reference for compiler 3 | correctness. 4 | This is unfortunate, since this was not what the paper was designed to 5 | accomplish. 6 | System F is a peculiar choice of language, the paper does not discuss many 7 | important issues of correctness or optimization, it lacks many formal details, 8 | and does not compare to many alternative design choices. 9 | This makes sense, since the paper was about type-preserving compilation. 10 | Unfortunately, it makes a poor standard reference. 11 | 12 | In this little document, I explore a not-necessarily-typed translation from a 13 | lambda calculus to an assembly language. 14 | I then layer a type system over the translations, to show they can also be type 15 | preserving. 16 | I briefly explore several alternative options and some implications, but do not 17 | explore them fully. 18 | 19 | The following came to me in a dream, resulting from thinking about teaching 20 | introduction to compiler construction for a million hours a week. 21 | 22 | # Sketch 23 | Most of the compiler correctness literature focuses on two issues: 24 | control flow (ANF and CPS), and higher-order functions (closure conversion). 25 | Typically, we first perform ANF or CPS, then closure conversion. 26 | This is because ANF and CPS can introduce new closures. 27 | Next, the compiler will specify representation of various data types, manually 28 | allocating heap data structures, etc. 29 | Finally, the compiler performs code generation. 30 | 31 | In short, the standard model is 32 | 33 | 1. CPS Translation 34 | 2. Closure Conversion 35 | 3. Hoist Procedures 36 | 4. Explicit Allocation 37 | 5. Code generation 38 | 39 | Interestingly, this model differs significantly from the performance-oriented 40 | Chez Scheme compiler, which uses a model somewhat like: 41 | 42 | 1. Closure Conversion 43 | 2. Hoist Procedures 44 | 3. Explicit Allocation 45 | 4. Monadic Form Transformation 46 | 5. Code generation 47 | 48 | CPS and ANF are avoided, and monadic form is used instead. 49 | Contrary to arguments about the benefits of CPS and ANF in the compiler 50 | correctness literature, monadic form offers apparent advantages in optimization. 51 | For example, compared to ANF, code duplication is avoided by making the code 52 | generator aware of monadic patterns. 53 | No join points need to be introduced, avoiding additional closure indirection, 54 | and the cost of optimizing additional closures. 55 | 56 | More important, perhaps, by supporting algebraic expressions until quite late in 57 | the compiler intermediate languages, the design of several compiler passes is 58 | simplified. 59 | The monadic form translation itself is simpler than CPS or ANF. 60 | Explicit allocation in a production compiler is non-trivial and involves many 61 | intermediate computations, and algebraic expressions simplify its implementation. 62 | 63 | A sketch of the key translation steps follows: 64 | 65 | ## Closure Conversion 66 | Closure conversion assumes all functions bound in a `letrec` form. 67 | A simple transformation on the surface language can deal with elaborating 68 | anonymous lambdas. 69 | 70 | The target language binds all closed lambdas (procedures) in a `letrec` form, 71 | and binds all closures in a `letrec` form, as closures could be mutually 72 | recursive. 73 | For clarity, we name these `pletrec`, for procedure `letrec`, and `cletrec`, for 74 | closure `letrec`. 75 | Procedures now take an extra argument, the closure, and dereference previously 76 | free variables from the closure. 77 | The representation of closure is kept abstract. 78 | 79 | Binding procedures in a `letrec` form is odd, since the procedures are now closed. 80 | However, further optimization may inline closures, and thus free references to 81 | other procedures. 82 | We use a `letrec` form to enable this. 83 | 84 | Distinguishing the `pletrec` and `cletrec` encodes the semantic distinction 85 | between the two in syntax, simplifying the next transformation. 86 | 87 | ``` 88 | (letrec ([f (λ (x ...) e)] ...) 89 | body) 90 | -> 91 | (pletrec ([f$label (λ (c x ...) 92 | (let ([fx0 (closure-ref cp 0)] 93 | ... 94 | [fxn (closure-ref cp n)]) 95 | e))] ...) 96 | (cletrec ([f (closure f$label fx0 ... fxn)] ...) 97 | body)) 98 | ``` 99 | 100 | Application is translated simply by passing the applying the closure to itself 101 | and its formal parameters. 102 | ``` 103 | (e e1 ... en) 104 | -> 105 | (e e e1 ... en)) 106 | ``` 107 | The semantics of application is applying the procedure at the label of the 108 | closure. 109 | 110 | ## Hoisting 111 | Hoisting lifts all, now closed, procedures to the top-level. 112 | Syntactically, this amounts to hoisting and merging all `pletrec` forms. 113 | 114 | We can specify this as the following rewrite system: 115 | ``` 116 | C[(letrec ([f (λ (x ...) e)] ...) 117 | body)] 118 | -> 119 | (pletrec ([f (λ (x ...) e)] ...) C[body]) 120 | 121 | 122 | (pletrec ([f (λ (x ...) e)] ...) 123 | (pletrec ([g (λ (x' ...) e')]) 124 | body)) 125 | -> 126 | (pletrec ([f (λ (x ...) e)] 127 | ... 128 | [g (λ (x' ...) e')]) 129 | body) 130 | ``` 131 | Hoisting is the transformation that produces the normal form for this rewrite 132 | system. 133 | 134 | ## Explicit Allocation 135 | The next pass explicitly allocates closures and structured data. 136 | In a production compiler, it is usually part of the pass that specifies the bit 137 | representation of all data structures, including immediate data. 138 | This involves tagging objects. 139 | Most compiler correctness paper ignore the details of object tagging. 140 | 141 | The transformation is straight-forward. 142 | Each data structure introduction form allocates a fresh pointer and initializes 143 | it. 144 | Each elimination form is transformed into a pointer dereference. 145 | We use `(mset! base offset e)` to represent setting the pointer at base `base` 146 | and offset `offset` to the value of `e`, and `(mref base offset)` to dereferene 147 | a pointer. 148 | The form `(allocate bytes)` allocates a number of bytes, returning the pointer to 149 | the base of the allocated memory. 150 | I use Scheme-style `unquote` to indicate computations that we can stage, i.e., 151 | those that depend only on static values. 152 | 153 | ``` 154 | (closure label fx0 ... fxn) 155 | -> 156 | (let ([c (allocate ,(* word-size (+ 1 n))))]) 157 | (begin 158 | (mset! c ,(* 0 word-size) label) 159 | (mset! c ,(* 1 word-size) fx0) 160 | ... 161 | (mset! c ,(* n word-size) fxn) 162 | c)) 163 | ``` 164 | 165 | ``` 166 | (cons e1 e2) 167 | -> 168 | (let ([c (allocate ,(* word-size 2))]) 169 | (begin 170 | (mset! c ,(* 0 word-size) e1) 171 | (mset! c ,(* 1 word-size) fx0) 172 | c)) 173 | ``` 174 | 175 | ``` 176 | (car e) 177 | -> 178 | (mref e ,(* 0 word-size)) 179 | ``` 180 | 181 | ``` 182 | (cdr e) 183 | -> 184 | (mref e ,(* 1 word-size)) 185 | ``` 186 | 187 | ``` 188 | #f -> 0 189 | ``` 190 | 191 | ``` 192 | (if e1 e2 e3) -> (if (neq? e1 0) e2 e3) 193 | ``` 194 | 195 | From the above translation, it's not obvious why algebraic expressions would 196 | simplify the translation. 197 | The computed offsets are all constant (although we have expressed them as 198 | equations), and if the source operands were values, so would the target opands 199 | be. 200 | This is because, in keeping with compiler correctness literature, we have 201 | ignored object tagging and complex data types. 202 | The problem is simple to see when we add a data type like vectors or arrays: 203 | 204 | ``` 205 | (make-vector e) 206 | -> 207 | (let ([c (allocate (* ,word-size e))]) 208 | (begin 209 | (mset! c ,(* 0 word-size) e) 210 | (init-vector c e) 211 | c)) 212 | ``` 213 | where `init-vector` initializes the vector fields to 0. 214 | 215 | Now, we must compute the number of bytes to allocate dynmically. 216 | We could express this in a CPS or ANF language, but would have to bind the 217 | computation `(* word-size e)` to an auxiliary name. 218 | This needless complicates part of the translation. 219 | 220 | The situation is worse when we consider adding object tags. 221 | For eample, when we add taging, the translation of `cons` and `car` become: 222 | 223 | ``` 224 | (cons e1 e2) 225 | -> 226 | (let ([c (bitwise-ior (allocate ,(* word-size 2)) ,pair-tag)]) 227 | (begin 228 | (mset! (bitwise-and c ,pair-mask) ,(* 0 word-size) e1) 229 | (mset! (bitwise-and c ,pair-mask) ,(* 1 word-size) fx0) 230 | c)) 231 | ``` 232 | 233 | ``` 234 | (car e) 235 | -> 236 | (mref (bitwise-and e ,pair-mask) ,(* 0 word-size)) 237 | ``` 238 | 239 | We tag the pointer by adding a tag using bitwise inclusive or. 240 | To reference the correct address, we mask off the tag bits using bitwise and. 241 | Again, this is simplifies by algebraic expressions. 242 | While the few example seems easy, a real compiler has many data types to manage, 243 | and the compiler writer does not want to manually program in ANF or CPS to 244 | manage these auxiliary computations. 245 | The compiler should compile those for them. 246 | 247 | ## Monadic Form Transformation 248 | Now that most* (many things happened in the production compiler that are not of 249 | particular interest for the model compiler) of the compiler is done, we can make 250 | precise the order of evaluation. 251 | We use a monadic form transformation. 252 | 253 | I give a few representative cases. 254 | I use semantic brackets to indicate the translation as a meta-function on syntax. 255 | 256 | ``` 257 | 〚(e1 e2)〛= (let ([x1 〚e1〛]) 258 | (let ([x2 〚e2〛]) 259 | (x1 x2))) 260 | ``` 261 | 262 | ``` 263 | 〚(if e e1 e2)〛= (let ([x 〚e〛]) 264 | (if x 〚e1〛〚e2〛)) 265 | ``` 266 | 267 | ``` 268 | 〚(let ([x e] ...) body)〛= (let ([x 〚e〛] ...) 269 | 〚body〛) 270 | ``` 271 | 272 | ``` 273 | 〚(mset! e1 e2 e3)〛= (let ([x1 〚e1〛]) 274 | (let ([x2 〚e2〛]) 275 | (let ([x3 〚e3〛]) 276 | (mset! x1 x2 x3)))) 277 | ``` 278 | 279 | The monadic transformation, as is plain to see, is considerably simpler than a 280 | CPS translation or ANF translation. 281 | Unlike CPS, the translation itself does not necessarily fix a call-by-value or 282 | call-by-name evaluation strategy; this can be determined by a later pass that 283 | implements the semantics of `let`. 284 | Unlike ANF, the translation does not duplicate code (even without considering 285 | join points), and the translation does not require the compiler writer to write 286 | with procedural accumulators or return lists of bindings. 287 | 288 | The translation is obviously compositional, unlike CPS and ANF. 289 | 290 | Some arguments against the use of monadic form, including one by this very 291 | author, are that the machine for the monadic intermdiate language (MIL) requires 292 | more stack usage (in its abstract machine) than ANF or CPS. 293 | This is a red-herring. 294 | While true at the intermediate language level, the compiler writer does not care 295 | about IL stack usage, since only the target language will use stack. 296 | The code generate can do a fine job optimizing monadic form; even better, it 297 | seems, than ANF or CPS. 298 | 299 | ## Code Generation. 300 | In the production compiler, a lot takes place between monadic form and code gen. 301 | In the model, we don't care about these things. 302 | 303 | Normally, code generation is considered as one step. 304 | We break it into a couple steps. 305 | First we lower into an imperative language with registers and `if`. 306 | Then we deal with commuting conversion. 307 | The commuting conversions are necessary due to the remaining nesting from 308 | monadic normal form. 309 | Finally, we perform code gen, remove `if`, generating labels and jumps, and end 310 | in a TAL-esque target. 311 | 312 | We follow TAL and assume infinite registers. 313 | 314 | ### lower 315 | ``` 316 | 〚(if pred e1 e2)〛= (if pred 〚e1〛 〚e2〛) 317 | ``` 318 | 319 | ``` 320 | 〚(let ([x e] ...) body)〛= (begin (set! r_x 〚e〛) ... 〚body〛) 321 | ``` 322 | 323 | ### commute 324 | We specify the commuting conversions as a system of rewrites again: 325 | ``` 326 | (set! x (if pred e1 e2)) -> (if pred (set! x e1) (set! x e2)) 327 | ``` 328 | 329 | ``` 330 | (set! x (begin c ... body)) = (begin c ... (set! x body)) 331 | ``` 332 | 333 | Notice that this transformation would have duplicates code if performed at a 334 | higher-level of abstraction. 335 | 336 | ``` 337 | (let ([x (if pred e1 e2)]) body) -> (if pred (let ([x e1]) body) (let ([x e2]) body)) 338 | ``` 339 | 340 | It's the ability to use mutable global variables (registers) that enables the 341 | optimization. 342 | This suggests we can recover the optimization at a higher level of abstraction 343 | using mutable state, simplifying code generation. 344 | 345 | This may be more difficult to prove compositionally correct or secure. 346 | We would need to introduce abstraction boundaries to avoid leaking the value of 347 | the global variable previously hidden by lexical scope. 348 | Well-bracketed state may allow proving this compositionally correct or secure. 349 | 350 | It's possible to avoid the code duplication using a join-point, ala ANF, or via 351 | continuations, in CPS: 352 | ``` 353 | (let ([x (if pred e1 e2)]) body) -> (let ([k (lambda (x) body)]) (if pred (k e1) (k e2))) 354 | ``` 355 | 356 | However, this necessarily creates a closures, which may be difficult to optimize 357 | away, and at the very least costs time to optimize. 358 | 359 | ### code gen 360 | ``` 361 | 〚(if (,cmp v1 v2) e1 e2)〛= (begin (compare v1 v2) (jump-if cmp L1) (with-label L2 〚e2〛) (with-label L1 〚e1〛)) 362 | ``` 363 | 364 | ``` 365 | 〚(pletrec ([x (λ () e)] ...) body)〛= (begin (with-label main 〚body〛) (with-label x 〚e〛) ...) 366 | ``` 367 | -------------------------------------------------------------------------------- /prisc2021-abstract.scrbl: -------------------------------------------------------------------------------- 1 | #lang scribble/acmart @sigplan @screen 2 | 3 | @title{Compilation as Multi-Language Semantics} 4 | @(require 5 | (only-in pict vc-append vl-append hc-append) 6 | "bib.rkt" 7 | scriblib/footnote 8 | latex-utils/scribble/theorem 9 | latex-utils/scribble/utils 10 | "lambda-s.rkt" 11 | "lambda-a.rkt" 12 | "anf.rkt" 13 | "defs.rkt") 14 | 15 | @author[ 16 | #:orcid "0000-0002-6402-4840" 17 | #:affiliation 18 | @affiliation[ 19 | #:institution "University of British Columbia" 20 | #:city "Vancouver" 21 | #:state "BC" 22 | #:country "CA" 23 | ] 24 | #:email (email "wjb@williamjbowman.com") 25 | ]{William J. Bowman} 26 | 27 | @;acmConference["Workshop on Principles of Secure Compilation (PriSC)" "January 17--22, 2021" "Online"] 28 | @acmYear{2021} 29 | @acmMonth{1} 30 | 31 | @;{abstract 32 | Modeling interoperability between programs in different languages is a key 33 | problem when modeling compositional and secure compilation, which has been 34 | successfully addressed using multi-language semantics. 35 | Unfortunately, existing models of compilation using multi-language semantics 36 | define two variants of each compiler pass are defined: a syntactic translation 37 | on open terms, and a run-time translation of closed terms at multi-language 38 | boundaries 39 | 40 | We introduce a novel work-in-progress approach to uniformly model a compiler 41 | entirely as a reduction system on open term in a multi-language semantics, 42 | rather than as a syntactic translation. 43 | This simultaneously defines the compiler and the interoperability semantics, 44 | reducing duplication. 45 | It also provides interesting semantic insights. 46 | Normalization of the cross-language redexes performs ahead-of-time (AOT) 47 | compilation. 48 | Evaluation in the multi-language models just-in-time (JIT) compilation. 49 | Confluence of multi-language reduction implies compiler correctness. 50 | Subject reduction of the multi-language reduction implies type-preservation of 51 | the compiler. 52 | This model provides a strong attacker model through contextual equivalence, 53 | retaining its usefulness for modeling secure compilation as full abstraction. 54 | } 55 | 56 | @section{Extended Abstract} 57 | Modeling interoperability between programs in different languages is a key 58 | problem when modeling compositional and secure compilation. 59 | Multi-language semantics provide a syntactic method for modeling language 60 | interopability@~cite{matthews2007}, and has proven useful in compiler 61 | correctness and secure compilation@~cite["ahmed2011" "perconti2014" 62 | "ahmed2015:snapl" "new2016" "patterson2017:linkingtypes"]. 63 | 64 | Unfortunately, existing models of compilation using multi-language semantics 65 | duplicate effort. 66 | Two variants of each compiler pass are defined: a syntactic translation on open 67 | terms, and a run-time translation of closed terms at multi-language 68 | boundaries@~cite["ahmed2011" "new2016"]. 69 | One must then prove that both definitions coincide. 70 | 71 | We introduce a novel work-in-progress approach to uniformly model both variants 72 | as a single reduction system on open terms in a multi-language semantics. 73 | This simultaneously defines the compiler and the interoperability semantics. 74 | It also has interesting semantic consequences: different reduction 75 | strategies model different compilation strategies, and standard theorems about 76 | reduction imply standard compiler correctness theorems. 77 | For example, we get a model of ahead-of-time (AOT) compilation by normalizing 78 | cross-language redexes; the normal form with respect to these redexes is a 79 | target language term. 80 | We model just-in-time (JIT) compilation as nondeterministic evaluation in the 81 | multi-language: a term can either step in the source, or translate then step in 82 | the target. 83 | We prove that confluence of multi-language reduction implies compiler 84 | correctness and part of full abstraction; and that subject reduction 85 | implies type-preservation of the compiler. 86 | 87 | @;The model also retains properties valued for compiler construction and validation. 88 | @;Reduction systems compose easily, ensuring vertical composition of 89 | @;separate passes. 90 | @;Strong horizontal composition is enabled easily by embedding in the 91 | @;multi-language and syntactic linking. 92 | 93 | @;In this talk, I'll describe part of a compiler from a Scheme-like language to an 94 | @;x64-64-like language designed completely as a series of multi-language 95 | @;semantics. 96 | @;I'll focus on a single pass to describe the approach to designing a compiler as 97 | @;multi-language reduction, and formalize several definitions derived from the 98 | @;multi-language semantics. 99 | 100 | @subsubsub*section{An example instance: Reduction to A-normal form} 101 | @;@figure["fig:src-syntax" @elem{@|source-lang| Syntax} 102 | @; (render-language λiL #:nts '(e x tag-pred arith-op)) 103 | @;] 104 | 105 | Our approach generalizes from high-level to low-level transformations of a 106 | wide array of language features. 107 | To demonstrate this, we have developed a 5-pass model compiler from a 108 | Scheme-like language to an x86-64-like language. 109 | Here, we model one interesting compiler pass: reduction to A-normal form (ANF). 110 | This pass is a good example and stress test. 111 | The A-reductions are tricky to define because they reorder a term with respect 112 | to its context, while the other passes locally transform a term in an arbitrary 113 | context. 114 | 115 | The source is a standard dynamically typed functional imperative language, 116 | modeled on Scheme. 117 | It has a call-by-value heap-based small-step semantics, 118 | @render-term[ANFL (λi->j (H S.e_1) (H S.e_2))], where @render-term[ANFL H] 119 | represents the heap and @render-term[ANFL S.e] is represents a source 120 | expression.@note{ 121 | We use a prefix followed by a dot (.) to distinguish terms in each 122 | language---the prefix @emph{S} for source terms and the prefix @emph{A} for ANF 123 | terms. 124 | } 125 | We omit the syntax and reduction rules for brevity. 126 | 127 | @;@figure["fig:anf-syntax" @elem{@|anf-lang| Syntax} 128 | @; (render-language λaL #:nts '(v n e)) 129 | @; (render-language λaL-eval #:nts '(E v)) 130 | @;] 131 | 132 | The target language is essentially the same, but the syntax is restricted to 133 | A-normal form: all computations @render-term[ANFL A.n] require values 134 | @render-term[ANFL A.v] as operands; expressions @render-term[ANFL A.e] 135 | cannot be nested and only explicitly compose and sequence intermediate 136 | computations @render-term[ANFL A.n]. 137 | The reduction relation, @render-term[ANFL (λa->j (H A.e_1) (H A.e_2))], 138 | does not require a control stack. 139 | 140 | @figure["fig:anf-multi-syn" @elem{@|anf-multi-lang| Syntax (excerpts)} 141 | (parameterize ([extend-language-show-union #t] 142 | [extend-language-show-extended-order #t]) 143 | (hc-append 40 144 | (render-language ANFL #:nts '(A.e A.n A.v)) 145 | (render-language ANFL #:nts '(S.e e)))) 146 | ] 147 | 148 | To develop a multi-language semantics, we embed syntactic terms from each 149 | language into a single syntax, defined in @Figure-ref{fig:anf-multi-syn}. 150 | We extend each meta-variable with boundary terms @render-term[ANFL (SA A.e)] 151 | (``Source on the outside, ANF on the inside'') and @render-term[ANFL (AS S.e)] 152 | (``ANF on the outside, Source on the inside''). 153 | 154 | @(require (only-in redex/pict render-reduction-relation-rules)) 155 | @figure["fig:a-red" @elem{The A-reductions (excerpts)} 156 | (parameterize ([render-reduction-relation-rules '("A-normal" "A-merge-l" "A-merge-b" "A-join" "A-lift")]) 157 | (render-reduction-relation anf-> #:style 'compact-vertical)) 158 | ] 159 | 160 | The translation to ANF can be viewed as a reduction system in the 161 | multi-language. 162 | @;In fact, the original presentation of ANF was as a reduction system, and this is 163 | @;where A-normal form derives its name---the normal form with respect to the 164 | @;A-reductions@~cite{flanagan1993}. 165 | We define the A-reductions in @Figure-ref{fig:a-red}. 166 | These rules are essentially standard@~cite{flanagan1993}, but we modify them to 167 | make boundary transitions explicit. 168 | The A-reductions have the form @render-term[ANFL (anf->j S.e S.e)], reducing 169 | source expressions in the multi-language. 170 | Each A-reduction rewrites a source expression in a source evaluation context, 171 | transforming the control stack into a data stack. 172 | For example, the A-lift rule lifts a trivial computation, let-binding it and 173 | providing the let-bound name (a value) in evaluation position, explicitly 174 | sequencing the computation @render-term[ANFL A.n] with the evaluation context 175 | @render-term[ANFL S.E]. 176 | The side-conditions syntactically encode termination conditions, preventing 177 | A-reductions of target redexes and in empty evaluation contexts. 178 | 179 | @figure["fig:anf-boundary-red" @elem{@|anf-multi-lang| Boundary Reductions} 180 | (render-reduction-relation st-> #:style 'horizontal) 181 | ] 182 | 183 | We supplement the multi-language A-reductions with the standard boundary 184 | cancellation reductions, given in @Figure-ref{fig:anf-boundary-red}. 185 | These apply under any multi-language context @render-term[ANFL C]. 186 | 187 | @figure["fig:anf-trans-red" @elem{@|anf-multi-lang| Translation Reductions} 188 | (hc-append 40 189 | (parameterize ([extend-language-show-union #t] 190 | [extend-language-show-extended-order #t]) 191 | (render-language ANFL #:nts '(T))) 192 | (with-paper-rewriters (render-judgment-form-rows anf->+j '(2)))) 193 | ] 194 | 195 | In @Figure-ref{fig:anf-trans-red} we define the translation reductions. 196 | These extend the A-reductions to apply under any translation context 197 | @render-term[ANFL T]. 198 | The construction of the translation context for ANF is a little unusual, but 199 | the intuition is simple: a translation context identifies a pure source 200 | expression under any context, including under a target/source boundary. 201 | The context @render-term[ANFL A.Cm] corresponds to an ANF context that can have 202 | any expression in the hole. 203 | In one step, the translation reductions can perform either one A-reduction 204 | or one boundary cancellation. 205 | 206 | From the translation reductions, we derive AOT compilation as 207 | normalization with respect to translation reductions. 208 | @mdef["ANF Compilation by Normalization"] 209 | @render-judgment-form[anf-compile] 210 | 211 | @figure["fig:anf-multi-red" @elem{@|anf-multi-lang| Multi-language Reduction}]{ 212 | @(with-paper-rewriters (render-judgment-form-rows anf-eval->+'(2 2 1))) 213 | } 214 | 215 | Finally, we define the multi-language semantics in 216 | @Figure-ref{fig:anf-multi-red}. 217 | This defines all possible transitions in the multi-language. 218 | A term can either take a step in the source language, or a translation step, or 219 | a step in the target language. 220 | Multi-language reduction is indexed by a heap, @render-term[ANFL H], which 221 | is used by the source and target reductions but not the translation reductions. 222 | 223 | Note that terms already in the heap are not translated, which corresponds to an 224 | assumption that the language memory models are identical. 225 | We could lift this restriction by adding multi-language boundaries to heap 226 | values and extending translation reductions to apply in the heap. 227 | 228 | The multi-language reduction allows reducing in the source, modeling 229 | interpretation, or translating then reducing in the target, modeling 230 | JIT compilation before continuing execution. 231 | This does not model speculative optimization; equipping the multi-language with 232 | assumption instructions as done by @citet{flueckiger2018:jit} might support 233 | modeling this. 234 | 235 | Standard meta-theoretic properites of reduction impliy standard compiler 236 | correctness results. 237 | 238 | Subject reduction of the multi-language semantics implies type-preservation of 239 | the compiler. 240 | This is simple for our present compiler, since the type system is simple, but 241 | the theorems applies for more complex type systems. 242 | 243 | @mthm[@elem{Subject Reduction implies Type Preservation} #:tag "thm:type-pres-type-pres"]{ 244 | If (@render-term[ANFL (ANFL-types Γ e_1 τ)] and @render-term[ANFL (anf->*j e_1 e_2)] 245 | implies @render-term[ANFL (ANFL-types Γ e_2 τ)]) then@exact{\\} 246 | (@render-term[ANFL (λiL-types S.Γ S.e S.τ)] and @render-term[ANFL (anf-compile S.e A.e)] implies 247 | ∃@render-term[ANFL A.Γ],@render-term[ANFL A.τ]. 248 | @render-term[ANFL (λaL-types A.Γ A.e A.τ)]). 249 | } 250 | 251 | We derive compiler correctness from confluence. 252 | 253 | @mconj[@elem{Confluence} #:tag "thm:anf:confluence"]{ 254 | If @render-term[ANFL (anf-eval->* (H e) (H_1 e_1))] and @exact{\\} 255 | @render-term[ANFL (anf-eval->* (H e) (H_2 e_2))] then 256 | @render-term[ANFL (anf-eval->* (H_1 e_1) (H_3 e_3))] and 257 | @render-term[ANFL (anf-eval->* (H_2 e_2) (H_3 e_3))] 258 | } 259 | 260 | Note the multi-language semantics can reduce open terms, so confluence implies 261 | correctness of both the AOT and the JIT compiler. 262 | As an example, whole-program correctness is a trivial corollary of confluence. 263 | 264 | @mcor[@elem{Whole-Program Correctness} #:tag "thm:anf:correct"]{ 265 | @exact{\mbox{}\\} 266 | If 267 | @render-term[ANFL (λi->j* (() S.e) (H S.v))] and 268 | @render-term[ANFL (anf-compile S.e A.e)] then 269 | @render-term[ANFL (λa->j* (() A.e) (H A.v))] such that 270 | @render-term[ANFL A.v] is equal to 271 | @render-term[ANFL S.v]. 272 | } 273 | 274 | Multi-language semantics provide a strong attacker model through contextual 275 | equivalence. 276 | A context @render-term[ANFL C] models an attacker that can provide 277 | either source or target code or data as input and observe the result. 278 | Contextual equivalence is extended to relate reduction configurations, not 279 | just terms, to enable the definition to apply to the JIT model. 280 | 281 | @mdef["Contextual Equivalence"]{ 282 | @render-term[ANFL (H_1 e_1)] @exact{$\mathrel{\approx}$} @render-term[ANFL (H_2 e_2)] 283 | if for all multi-language contexts @render-term[ANFL C], @render-term[ANFL 284 | (H_1 (in-hole C e_1))] and @render-term[ANFL (H_2 (in-hole C e_2))] co-terminate 285 | in @(ANFL->-arrow).} 286 | 287 | We define secure compilation of both the AOT and JIT models as full abstraction: 288 | contextual equivalence is preserved and reflected through multi-language 289 | reduction. 290 | 291 | @mthm["Full Abstraction (multi-language)"]{ 292 | Suppose 293 | @render-term[ANFL (anf-eval->+ (H_1 e_1) ((prime H_1) (prime e_1)))] and 294 | @render-term[ANFL (anf-eval->+ (H_2 e_2) ((prime H_2) (prime e_2)))]. 295 | @exact{\\}Then 296 | @render-term[ANFL (H_1 e_1)] @exact{$\mathrel{\approx}$} @render-term[ANFL (H_2 e_2)] 297 | if and only if 298 | @render-term[ANFL ((prime H_1) (prime e_1))] @exact{$\mathrel{\approx}$} 299 | @render-term[ANFL ((prime H_2) (prime e_2))]. 300 | } 301 | 302 | The normally easy part of full abstraction, within the multi-language, is now a 303 | direct consequence of confluence, since both compilation and contextual 304 | equivalence are defined by multi-language reduction. 305 | The hard part, showing any multi-language context (attacker) is emulated by a 306 | source context, remains. 307 | 308 | @;{ 309 | Theorem (Contextual Equivalence implies Full Abstraction): If (e1 \approx e2) then (Suppose and e1 \Rightarrow e1' and e2 \Rightarrow e2'. e1 \approx e2 and e1 \Rightarrow e1' and e2 \Rightarrow e2' if and only iff e1' \approx e2') (where \approx is contextual equivalence, and \Rightarrow is single-step multi-language reduction). 310 | 311 | This theorem is a bit silly. Under the first premise, the first premise of full abstraction is completely unnecessary, so this statement devolves into the statement for full abstraction, I think. We can immediately simplify: 312 | 313 | It suffices to prove just full abstraction: (Suppose and e1 \Rightarrow e1' and e2 \Rightarrow e2'. Then e1 \approx e2 if and only iff e1' \approx e2') (where \approx is contextual equivalence, and \Rightarrow is single-step multi-language reduction). 314 | 315 | Now, we can easily prove the if case. Assume e1 \approx e2 and e1 \Rightarrow 316 | e1' and e2 \Rightarrow e2'. We must show e1' \approx e2'. 317 | Assume an arbitrary context C, we must show C[e1'] and C[e2'] co-terminate. 318 | Instantiate our hypothesis e1 \approx e2 with C; we learn that C[e1] and C[e2] co-terminate. 319 | Observe that reduction in the multi-language is the same relation used by 320 | contextual equivalence (I think this is what you were trying to observe). Since 321 | e1 \approx e2, we know that C[e1] \Rightarrow* v and C[e2] \Rightarrow* v, or 322 | both diverge. In the former case, by confluence, C[e1] \Rightarrow e1' 323 | \Rightarrow* v and C[e2] \Rightarrow e2' \Rightarrow* v, so clearly C[e1] 324 | \approx C[e2']. Similarly in the latter case. 325 | 326 | We must now show the only-if case: Assuming C[e1'] \approx C[e2'] and e1 \Rightarrow e1' and e2 \Rightarrow e2', we must show C[e1] \approx C[e2]. The proof is similar to the previous case. 327 | 328 | Unfortunately, even this isn't quite full abstraction. Really, we want the statement to be purely in terms of source and target contexts. Ideally, this theorem would imply that theorem. 329 | 330 | But that seems unlikely; this theorem is extremely strange. We never once had to reason about the translation itself or make any assumptions. Full abstraction just holds by construction. This is worrying. Either we have a bug in the proof, or our definitions make full abstraction an uninteresting property. 331 | } 332 | 333 | @(generate-bibliography) 334 | -------------------------------------------------------------------------------- /defs.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | (require 4 | retex 5 | redex/pict 6 | redex/reduction-semantics 7 | file/convertible 8 | racket/function 9 | pict 10 | (prefix-in pict: pict) 11 | (only-in rackunit require/expose) 12 | pict/convert 13 | racket/format 14 | racket/match 15 | (for-syntax 16 | racket/base 17 | syntax/parse) 18 | scribble/base 19 | retex 20 | "lambda-s.rkt" 21 | (only-in 22 | scribble/core 23 | link-render-style 24 | current-link-render-style) 25 | (only-in scriblib/figure [Figure-ref pre:Figure-ref]) 26 | (except-in scriblib/figure Figure-ref) 27 | with-cache) 28 | 29 | (require/expose pict/private/pict (converted-pict-parent converted-pict?)) 30 | 31 | (provide 32 | (all-from-out scriblib/figure) 33 | (rename-out 34 | [_render-metafunction render-metafunction] 35 | [_render-judgment-form render-judgment-form]) 36 | render-judgment-form-rows 37 | render-mathpar-judgment 38 | extend-language-show-union 39 | union-reduction-relations 40 | reduction-relation 41 | extend-language-show-extended-order 42 | term 43 | (all-defined-out)) 44 | 45 | (define ie (emph "i.e.")) 46 | 47 | (*use-cache?* #f) 48 | (define source-lang (elem "λ" (elem #:style 'superscript "S"))) 49 | (define anf-lang (elem "λ" (elem #:style 'superscript "A"))) 50 | (define anf-multi-lang (elem "λ" (elem #:style 'superscript "sa"))) 51 | 52 | (define todo margin-note) 53 | 54 | (define (Figure-ref e) 55 | (pre:Figure-ref e #:link-render-style (link-render-style 'default))) 56 | 57 | (define-syntax-rule (render-language e ...) 58 | (with-paper-rewriters (render-language-cache e ...))) 59 | 60 | (define-syntax-rule (_render-metafunction e ...) 61 | (with-paper-rewriters (render-metafunction e ...))) 62 | 63 | (define-syntax-rule (_render-judgment-form e ...) 64 | (with-paper-rewriters (render-judgment-form e ...))) 65 | 66 | (define-syntax-rule (render-reduction-relation e ...) 67 | (with-paper-rewriters (render-reduction-relation-cache e ...))) 68 | 69 | (define-syntax-rule (render-term e ...) 70 | (with-paper-rewriters (render-term-cache e ...))) 71 | 72 | (define-syntax-rule (render-src e) 73 | (render-term λiL e)) 74 | 75 | (define (apply-reduction-relation-n red n e) 76 | (apply-reduction-relation* 77 | red 78 | e 79 | #:stop-when 80 | (let ([b (box 0)]) 81 | (lambda (x) 82 | (begin 83 | (set-box! b (add1 (unbox b))) 84 | (> (unbox b) n)))))) 85 | 86 | (define-syntax (render-step stx) 87 | (syntax-case stx () 88 | [(_ lang red arrow n e) 89 | #`(with-paper-rewriters 90 | (vl-append 91 | (render-term lang e) 92 | #,@(for/list ([i (in-range 1 (add1 (syntax-e #'n)))]) 93 | #`(hc-append 94 | (pad-arrow arrow) 95 | (with-paper-rewriters 96 | (render-term/pretty-write 97 | lang 98 | (car (apply-reduction-relation-n red #,i (term e)))))))))])) 99 | 100 | (define-syntax-rule (render-prefix-and-finish lang red arrow n e) 101 | (with-paper-rewriters 102 | (vl-append 103 | (render-step lang red arrow n e) 104 | (hc-append 105 | (pad-arrow (star-arrow arrow)) 106 | (with-paper-rewriters (render-term/pretty-write lang (car (apply-reduction-relation* red (term e))))))))) 107 | 108 | #;(default-font-size 12) 109 | #;(metafunction-font-size 10) 110 | #;(label-font-size 9) 111 | 112 | ; "Linux Libertine" might be okay? 113 | (define Linux-Liberterine-name "Linux Libertine") 114 | (define Inconsolata-name "Inconsolata") 115 | #;(define LatinModernMath-Regular-name "Latin Modern Math") 116 | (require racket/draw) 117 | (define (check-font name) 118 | (unless (member name (get-face-list)) 119 | (error 'check-font "expected the font ~a to be installed\n" name))) 120 | (check-font Inconsolata-name) 121 | (check-font Linux-Liberterine-name) 122 | #;(check-font LatinModernMath-Regular-name) 123 | 124 | #;(define math-style Linux-Liberterine-name) 125 | 126 | #;(greek-style 'roman) 127 | #;(upgreek-style 'roman) 128 | #;(metafunction-style 'swiss) 129 | #;(label-style 'swiss) 130 | #;(default-style math-style) 131 | #;(literal-style math-style) 132 | #;(paren-style 'roman) 133 | #;(grammar-style (cons 'italic 'roman)) 134 | 135 | (define (replace-font param) 136 | (let loop ([x (param)]) 137 | (cond 138 | [(pair? x) (cons (car x) (loop (cdr x)))] 139 | [else Linux-Liberterine-name]))) 140 | 141 | (define (def-t str) (text str (default-style) (default-font-size))) 142 | (define (mf-t str) (text str (metafunction-style) (metafunction-font-size))) 143 | (define (nt-t str) (text str (non-terminal-style) (default-font-size))) 144 | (define (nt-sub-t str) (text str (cons 'subscript (non-terminal-style)) (default-font-size))) 145 | (define (literal-t str) (text str (literal-style) (default-font-size))) 146 | 147 | 148 | (struct pict+tag (pict tag) 149 | #:property prop:pict-convertible 150 | (lambda (x) (pict+tag-pict x)) 151 | #:property prop:convertible 152 | (lambda (v r d) 153 | (case r 154 | [(text) (pict+tag-tag v)] 155 | [else (convert (pict+tag-pict v) r d)]))) 156 | 157 | (define (compute-tag base ss) 158 | (define (to-string x) 159 | ((match x 160 | [(? string?) values] 161 | [(or (? number?) (? symbol?)) ~a] 162 | [(? lw?) (const "")] 163 | [(? pict+tag?) pict+tag-tag] 164 | [(? pict?) compute-tag2]) 165 | x)) 166 | (apply 167 | string-append 168 | (to-string base) 169 | (map to-string ss))) 170 | 171 | (define (compute-tag2 p) 172 | (or (compute-tag2* p) "")) 173 | 174 | (define (compute-tag2* p) 175 | (cond 176 | [(and (converted-pict? p) 177 | (pict+tag? (converted-pict-parent p))) 178 | (pict+tag-tag (converted-pict-parent p))] 179 | [else 180 | (let loop ([v #f] 181 | [l (pict-children p)]) 182 | (cond 183 | [(null? l) v] 184 | [else 185 | (define x (compute-tag2 (child-pict (car l)))) 186 | (define next 187 | (cond 188 | [(and x v) 189 | (string-append v x)] 190 | [else (or x v)])) 191 | (loop next (cdr l))]))])) 192 | 193 | (define (lift-to-taggable pict tag) 194 | (if (pict+tag? pict) 195 | (pict+tag (pict+tag-pict pict) tag) 196 | (pict+tag pict tag))) 197 | 198 | (define (text t f [s 12] #:kern? [kern? #t]) 199 | (lift-to-taggable 200 | (if kern? 201 | (kern-text t f s) 202 | (pict:text t f s)) 203 | t)) 204 | 205 | (define (kern-text t f s) 206 | (define split (breakout-manual-adjustment t)) 207 | (apply hbl-append 208 | (for/list ([segement (in-list split)]) 209 | (if (or (pict-convertible? segement) (pict? segement)) 210 | segement 211 | (pict:text segement f s))))) 212 | 213 | (define hookup 214 | (drop-below-ascent 215 | (text "⇀" Linux-Liberterine-name (default-font-size) #:kern? #f) 216 | 2)) 217 | (define hookdown 218 | (drop-below-ascent 219 | (text "⇁" Linux-Liberterine-name (default-font-size) #:kern? #f) 220 | 2)) 221 | (define right 222 | (text "⟶" Linux-Liberterine-name (default-font-size) #:kern? #f)) 223 | 224 | ;; TODO Should be a parameter. 225 | (define adjustment-table 226 | (hash 227 | #\⇀ hookup 228 | #\⇁ hookdown 229 | #\⟶ right)) 230 | 231 | (define (breakout-manual-adjustment t) 232 | (define (stringify x) 233 | (apply string (reverse x))) 234 | (for/fold ([current '()] 235 | [all '()] 236 | #:result (reverse (cons (stringify current) all))) 237 | ([c (in-string t)]) 238 | (match (hash-ref adjustment-table c c) 239 | [(or (? pict-convertible? x) (? pict? x)) 240 | (values '() (list* x (stringify current) all))] 241 | [(? char? c) (values (cons c current) all)]))) 242 | 243 | (define (words str) 244 | (text str (default-style) (default-font-size))) 245 | 246 | #;(define (typeset-supers s) 247 | (render-word-sequence (blank) s +2/5)) 248 | #;(define (typeset-subs s) 249 | (render-word-sequence (blank) s -2/5)) 250 | #;(define (render-word-sequence base s l) 251 | (define p 252 | (for/fold ([p base]) 253 | ([s (in-list s)]) 254 | (hbl-append 255 | p 256 | (scale 257 | (cond [(string? s) (words s)] 258 | [(or (number? s) (symbol? s)) (words (~a s))] 259 | [(pict-convertible? s) s] 260 | [(lw? s) (render-lw (default-language) s)] 261 | [else (error 'render-op "don't know how to render ~v" s)]) 262 | .7)))) 263 | (lift-bottom-relative-to-baseline 264 | p 265 | (* l (pict-height p)))) 266 | 267 | (define (render-op p [x #f]) 268 | (define s (~a (if x x p))) 269 | (define head 270 | (hbl-append 271 | (if x p (blank)) 272 | (match (regexp-match* #rx"^[^^_]*" s) 273 | [(cons r _) (words r)] 274 | [_ (blank)]))) 275 | (define tails (regexp-match* #rx"(_|\\^)[^^_]*" s)) 276 | (render-op/instructions head tails)) 277 | 278 | (define (render-op/instructions base ss) 279 | (define-values (supers subs seq) 280 | (for/fold ([super '()] 281 | [sub '()] 282 | [seq '()] 283 | #:result (values (reverse super) (reverse sub) (reverse seq))) 284 | ([s (in-list ss)]) 285 | (match s 286 | [(or (regexp #rx"\\^(.*)" (list _ r)) 287 | (list 'superscript r)) 288 | (values (cons r super) sub (cons r seq))] 289 | [(or (regexp #rx"_(.*)" (list _ r)) 290 | (list 'subscript r)) 291 | (values super (cons r sub) (cons r seq))]))) 292 | (define the-super ""#;(typeset-supers supers)) 293 | (define the-sub ""#;(typeset-subs subs)) 294 | (lift-to-taggable 295 | (inset 296 | (refocus 297 | (hbl-append 298 | (refocus (hbl-append base the-sub) base) 299 | the-super) 300 | base) 301 | 0 302 | 0 303 | (max (pict-width the-sub) (pict-width the-super)) 304 | 0) 305 | (compute-tag base seq))) 306 | 307 | (define (collapse-consecutive-spaces l) 308 | (match l 309 | [(or (list _) (list)) l] 310 | [(list* "" "" r) 311 | (collapse-consecutive-spaces (cons "" r))] 312 | [(cons a b) 313 | (cons a (collapse-consecutive-spaces b))])) 314 | 315 | (define (binop op lws) 316 | (define left (list-ref lws 2)) 317 | (define right (list-ref lws 3)) 318 | (append (do-binop op left right) 319 | (list right ""))) 320 | 321 | (define (do-binop op left right [splice #f]) 322 | (define space (text " " (default-style) (default-font-size))) 323 | (append (list "") 324 | (if splice (list splice (just-after left splice)) (list left)) 325 | (list 326 | (just-after 327 | (hbl-append 328 | space 329 | (if (pict-convertible? op) op (render-op op)) 330 | space) 331 | left)) 332 | (list ""))) 333 | 334 | (define (infix op lws) 335 | (define all (reverse (cdr (reverse (cdr (cdr lws)))))) 336 | (collapse-consecutive-spaces 337 | (let loop ([all all]) 338 | (match all 339 | [(list* x (and dots (struct* lw ([e (or '... "...")]))) y rst) 340 | (append (do-binop op dots y x) (list "") 341 | (loop (cons y rst)))] 342 | [(list* x (and dots (struct* lw ([e (or '... "...")]))) rst) 343 | (list x dots "")] 344 | [(list* x y rst) 345 | (append (do-binop op x y) (list "") 346 | (loop (cons y rst)))] 347 | [(list x) (list x "")])))) 348 | 349 | 350 | (define (name-arrow name base [r 2] [t -3]) 351 | (with-paper-rewriters 352 | (pin-over 353 | base 354 | r t 355 | (text name Linux-Liberterine-name 7)))) 356 | 357 | (define (star-arrow base) 358 | (hbl-append base (inset (def-t "*") -2 0 0 0))) 359 | 360 | (define (λs->-arrow) 361 | (name-arrow "λs" (def-t "→"))) 362 | 363 | (define (λa->-arrow) 364 | (name-arrow "λa" (def-t "→"))) 365 | 366 | ;; Rewriters! 367 | (set-arrow-pict! 368 | '--> 369 | (lambda () 370 | (with-paper-rewriters/proc 371 | (lambda () 372 | (def-t "→"))))) 373 | 374 | (set-arrow-pict! '-->λs λs->-arrow) 375 | (set-arrow-pict! '-->λa λs->-arrow) 376 | 377 | (define (a->-arrow) (with-paper-rewriters (def-t "→ᵃ"))) 378 | (define (st->-arrow) (with-paper-rewriters (def-t "→ˢᵗ"))) 379 | 380 | (set-arrow-pict! '-->a a->-arrow) 381 | 382 | (set-arrow-pict! '-->st st->-arrow) 383 | 384 | (define (ANFL->-arrow) 385 | (name-arrow "λsa" (def-t "⇒") 1)) 386 | 387 | (define (ANFL->*-arrow) 388 | (with-paper-rewriters 389 | (star-arrow (ANFL->-arrow)))) 390 | 391 | (define (anf->+-arrow) 392 | (with-paper-rewriters (def-t "ˢ→ᵃ"))) 393 | 394 | (define (pad-arrow p) 395 | (hbl-append (def-t " ") p (def-t " "))) 396 | 397 | (define (anf-compile-arrow) 398 | (with-paper-rewriters (def-t "⇓ᵃⁿᶠ"))) 399 | 400 | (define (with-paper-rewriters/proc thunk) 401 | (with-compound-rewriters 402 | (['prime 403 | (λ (lws) 404 | (list "" 405 | (list-ref lws 2) 406 | "′"))] 407 | ['≡ 408 | (curry binop "≡")] 409 | ['not-equal? 410 | (λ (lws) 411 | (list "" 412 | (list-ref lws 2) 413 | (def-t " ≢ ") 414 | (list-ref lws 3) 415 | ""))] 416 | ['substitute 417 | (λ (lws) 418 | (list "" 419 | (list-ref lws 2) 420 | (def-t "[") 421 | (list-ref lws 3) 422 | (def-t " := ") 423 | (list-ref lws 4) 424 | ""))] 425 | ['subst-all 426 | (λ (lws) 427 | (list "" 428 | (list-ref lws 2) 429 | (def-t "[") 430 | (list-ref lws 3) 431 | (def-t " := ") 432 | (list-ref lws 4) 433 | ""))] 434 | ['non-Cn? 435 | (λ (lws) 436 | (list "" 437 | (list-ref lws 2) 438 | (def-t " ∉ ") 439 | (nt-t "A.Cn")))] 440 | ['non-Cm? 441 | (λ (lws) 442 | (list "" 443 | (list-ref lws 2) 444 | (def-t " ∉ ") 445 | (nt-t "A.Cm")))] 446 | ['non-Tv? 447 | (λ (lws) 448 | (list "" 449 | (list-ref lws 2) 450 | (def-t " ∉ ") 451 | (nt-t "A.v")))] 452 | ['non-boolean? 453 | (λ (lws) 454 | (list "" 455 | (list-ref lws 2) 456 | (def-t " ∉ ") 457 | (nt-t "boolean")))] 458 | ['non-fixnum? 459 | (λ (lws) 460 | (list "" 461 | (list-ref lws 2) 462 | (def-t " ∉ ") 463 | (nt-t "fixnum")))] 464 | ['non-false? 465 | (λ (lws) 466 | (list "" 467 | (list-ref lws 2) 468 | (def-t " ≠ ") 469 | (literal-t "#f")))] 470 | ['non-fv? 471 | (λ (lws) 472 | (list "" 473 | (list-ref lws 2) 474 | (def-t " ∉ ") 475 | (nt-t "fv")))] 476 | ['→ 477 | (λ (lws) 478 | (list "" 479 | (list-ref lws 2) 480 | (def-t " → ") 481 | (list-ref lws 3) 482 | ""))] 483 | ['λi->j 484 | (λ (lws) 485 | (list "" 486 | (list-ref lws 2) 487 | (pad-arrow (λs->-arrow)) 488 | (list-ref lws 3) 489 | ""))] 490 | ['λi->j* 491 | (λ (lws) 492 | (list "" 493 | (list-ref lws 2) 494 | (pad-arrow (star-arrow (λs->-arrow))) 495 | (list-ref lws 3) 496 | ""))] 497 | ['λa->j 498 | (λ (lws) 499 | (list "" 500 | (list-ref lws 2) 501 | (pad-arrow (λa->-arrow)) 502 | (list-ref lws 3) 503 | ""))] 504 | ['λa->j* 505 | (λ (lws) 506 | (list "" 507 | (list-ref lws 2) 508 | (pad-arrow (star-arrow (λa->-arrow))) 509 | (list-ref lws 3) 510 | ""))] 511 | ['anf->+j 512 | (λ (lws) 513 | (list "" 514 | (list-ref lws 2) 515 | (pad-arrow (anf->+-arrow)) 516 | (list-ref lws 3) 517 | ""))] 518 | ['anf->*j 519 | (λ (lws) 520 | (list "" 521 | (list-ref lws 2) 522 | (def-t " ˢ→ᵃ* ") 523 | (list-ref lws 3) 524 | ""))] 525 | ['anf->j 526 | (λ (lws) 527 | (list "" 528 | (list-ref lws 2) 529 | (pad-arrow (a->-arrow)) 530 | (list-ref lws 3) 531 | ""))] 532 | ['st->j 533 | (λ (lws) 534 | (list "" 535 | (list-ref lws 2) 536 | (pad-arrow (st->-arrow)) 537 | (list-ref lws 3) 538 | ""))] 539 | ['not-anf->+j 540 | (λ (lws) 541 | (list "" 542 | (list-ref lws 2) 543 | (pad-arrow (def-t "ˢ↛ᵃ")) 544 | ""))] 545 | ['anf-compile 546 | (λ (lws) 547 | (list "" 548 | (list-ref lws 2) 549 | (pad-arrow (anf-compile-arrow)) 550 | (list-ref lws 3) 551 | ""))] 552 | ['anf-eval->+ 553 | (λ (lws) 554 | (list "" 555 | (list-ref lws 2) 556 | (pad-arrow (ANFL->-arrow)) 557 | (list-ref lws 3) 558 | ""))] 559 | ['anf-eval->* 560 | (λ (lws) 561 | (list "" 562 | (list-ref lws 2) 563 | (pad-arrow (ANFL->*-arrow)) 564 | (list-ref lws 3) 565 | ""))] 566 | ['→* 567 | (λ (lws) 568 | (list "" 569 | (list-ref lws 4) 570 | (hbl-append (def-t " →") 571 | (inset (def-t "*") -2 0 0 0) 572 | (def-t " ")) 573 | (list-ref lws 5) 574 | ""))] 575 | ['ANFL-types 576 | (λ (lws) 577 | (list "" 578 | (list-ref lws 2) 579 | " ⊢ " 580 | (list-ref lws 3) 581 | " : " 582 | (list-ref lws 4)))] 583 | ['λiL-types 584 | (λ (lws) 585 | (list "" 586 | (list-ref lws 2) 587 | " ⊢ " 588 | (list-ref lws 3) 589 | " : " 590 | (list-ref lws 4)))] 591 | ['λaL-types 592 | (λ (lws) 593 | (list "" 594 | (list-ref lws 2) 595 | " ⊢ " 596 | (list-ref lws 3) 597 | " : " 598 | (list-ref lws 4)))] 599 | ) 600 | (with-atomic-rewriters 601 | (;; because · renders as {} for environment sets. 602 | ['dot (λ () (text "·" (default-style) (default-font-size)))] 603 | ;; render nat and mat as n and m for the proofs 604 | ['nat (λ () (text "n" (non-terminal-style) (default-font-size)))] 605 | ['hole (λ () (def-t "[·]"))]) 606 | (begin 607 | (define owsb (white-square-bracket)) 608 | (parameterize* ([default-font-size (get-the-font-size)] 609 | [metafunction-font-size (get-the-font-size)] 610 | [label-style Linux-Liberterine-name] 611 | [grammar-style Linux-Liberterine-name] 612 | [paren-style Linux-Liberterine-name] 613 | [literal-style Linux-Liberterine-name] 614 | [metafunction-style (cons 'bold Linux-Liberterine-name) 615 | #;(cons 'italic Linux-Liberterine-name)] 616 | [non-terminal-style (cons 'italic 617 | Linux-Liberterine-name) 618 | #;(cons 'bold Linux-Liberterine-name)] 619 | [non-terminal-subscript-style (replace-font non-terminal-subscript-style)] 620 | [non-terminal-superscript-style (replace-font non-terminal-superscript-style)] 621 | [default-style Linux-Liberterine-name] 622 | [white-square-bracket 623 | (lambda (open?) 624 | (let ([text (current-text)]) 625 | (define s (ghost (owsb open?))) 626 | (refocus 627 | (lbl-superimpose 628 | (scale 629 | (text (if open? "⟬" "⟭") 630 | (default-style) 631 | (default-font-size)) 632 | 1.05) 633 | s) 634 | s)))]) 635 | (thunk)))))) 636 | 637 | (define in-footnote? (make-parameter #f)) 638 | (define (get-the-font-size) (if (in-footnote?) 9 12)) 639 | 640 | (define-syntax-rule (-note . args) 641 | (parameterize ([in-footnote? #t]) 642 | (note . args))) 643 | 644 | (define-syntax with-paper-rewriters 645 | (syntax-parser 646 | [(_ e1 e ...) 647 | (quasisyntax/loc this-syntax 648 | (with-paper-rewriters/proc 649 | #,(syntax/loc this-syntax (λ () e1 e ...))))])) 650 | -------------------------------------------------------------------------------- /a-normal-form.scrbl: -------------------------------------------------------------------------------- 1 | #lang scribble/acmart @acmsmall @nonacm @screen 2 | @(require 3 | "bib.rkt" 4 | scriblib/footnote 5 | latex-utils/scribble/theorem 6 | latex-utils/scribble/utils 7 | "lambda-s.rkt" 8 | "lambda-a.rkt" 9 | "anf.rkt" 10 | "defs.rkt") 11 | 12 | @title{A-normalization} 13 | Our first pass translates to A-normal form (ANF) to make data flow explicit 14 | in the syntax. 15 | CPS is frequently studied in compiler correctness as this first 16 | pass@~cite{morrisett1998:ftotal kennedy2007 ahmed2011 perconti2014 bowman2018:cps-sigma}. 17 | CPS has the advantage of making both control flow and data flow explicit in the 18 | syntax. 19 | ANF, meanwhile, leaves some aspects of control flow implicit, in particular the 20 | call-and-return structure of non-tail function calls, and evaluation order. 21 | We use ANF since there is already a presentation of ANF as a reduction relation, 22 | namely, the A-reductions@~cite{sabry1992 flanagan1993}, which served as 23 | inspiration for the present work and a good starting point. 24 | 25 | We extend the A-reductions to support some Scheme-like imperative features, and 26 | make explicit the multi-language semantics@~cite{matthews2007} of the reduction 27 | system that was implicit in earlier presentations of the A-reductions. 28 | 29 | @section{ANF Language} 30 | @figure["fig:anf-syntax" @elem{@|anf-lang| Syntax} 31 | (render-language λaL #:nts '(v n e)) 32 | ] 33 | 34 | We specify the target language as essentially the source language but with the 35 | syntax restricted to A-normal form. 36 | This is defined in @Figure-ref{fig:anf-syntax}. 37 | 38 | ANF specifies a syntactic distinction between values @render-term[λaL v], 39 | computations @render-term[λaL n], and configurations @render-term[λaL e]. 40 | All elimination forms work directly on values rather than arbitrary expressions, 41 | so control must be manually composed using @render-term[λaL let] and 42 | @render-term[λaL begin]. 43 | 44 | We can consider ANF as normalized monadic form for the continuation monad. 45 | The monad implements @render-term[λaL bind] as @render-term[λaL let] and 46 | @render-term[λaL begin] and @render-term[λaL return] is implicit by the include 47 | of @render-term[λaL V] in the operations of the monad @render-term[λaL n]. 48 | The form is normal with respect the following commuting conversion. 49 | @(require redex/reduction-semantics) 50 | @render-reduction-relation[ 51 | (reduction-relation 52 | λaL 53 | (--> (bind ([x_1 (bind ([x_2 n_2]) e_2)]) e_1) 54 | (bind ([x_2 n_2]) (bind ([x_1 e_2]) e_1)))) 55 | #:style 'horizontal 56 | ] 57 | 58 | @section{Dynamic Semantics} 59 | 60 | @figure["fig:anf-eval-syn" @elem{@|anf-lang| Evaluation Contexts} 61 | (render-language λaL-eval #:nts '(E v)) 62 | ] 63 | Since data flow is explicit in the syntax, we no longer need a complex 64 | evaluation context to specify how to compute values from subexpressions. 65 | Instead, we only need to choose the evaluation order of simple computations. 66 | In @Figure-ref{fig:anf-lang-syn}, we define the evaluation context 67 | @render-term[λaL-eval E], and a separate class of run-time values 68 | @render-term[λaL-eval v]. 69 | @note{We could eliminate the evaluation context entirely by removing multi-arity 70 | @render-term[λaL let] and @render-term[λaL begin] from the language, but this 71 | only obscures the fact that evaluation order must still be specified in ANF, and 72 | a control stack is still required in the semantics. 73 | There's also no particular advantage to trying to simplify the evaluation 74 | context.} 75 | 76 | Note that, by contrast, the CPS translation would be responsible for encoding 77 | control flow information, including evaluation order, in the syntax as and data 78 | flow. 79 | Using ANF offers a minor advantage to to implementing lazy languages, as the 80 | compiler does not need to change the syntax in order to change evaluation 81 | strategy. 82 | 83 | @(require (only-in pict vc-append vl-append hc-append)) 84 | @figure["fig:anf-lang-red" @elem{@|anf-lang| Reduction} 85 | (render-reduction-relation 86 | (union-reduction-relations λa->composition λa->admin λa->bools) 87 | #:style 'horizontal) 88 | ] 89 | 90 | The reduction system for @|anf-lang| is not significantly different from @|source-lang|. 91 | We give the example rules in @Figure-ref{fig:anf-lang-red}. 92 | The main differences are in the definition of the evaluation contexts, and in 93 | the reduction rules for function calls and composition forms. 94 | The composition forms @render-term[λaL letrec], @render-term[λaL let], and 95 | @render-term[λaL begin] now only appear at the top-level and not under an 96 | evaluation, as does @render-term[λaL if]. 97 | 98 | @figure["fig:anf-lang-hcomp" @elem{@|anf-lang| Heterogeneous Composition} 99 | (render-metafunction hcompose) 100 | ] 101 | 102 | The definition of β-reduction is slightly complicated since substitution must 103 | replace a value, the @render-term[λaL hole] of the evaluation context, by a 104 | configuration @render-term[λaL e] from the body of the function. 105 | This is not well-defined using standard substition in ANF (although it is in 106 | monadic form), since it would not preserve the normal form. 107 | Instead, we define a heterogeneous substitution metafunction @render-term[λaL 108 | hcompose] in @Figure-ref{fig:anf-lang-hcomp}. 109 | Note that this definition duplicates code in the branches of @render-term[λaL 110 | if]; this is intentional in the specification of the ANF semantics, but we avoid 111 | it by using join-point optimization during compilation to ANF. 112 | 113 | @section{The @|source-lang|/@|anf-lang| multi-language} 114 | @figure["fig:anf-multi-syn" @elem{@|anf-multi-lang| Syntax (excerpts)} 115 | (parameterize ([extend-language-show-union #t] 116 | [extend-language-show-extended-order #t]) 117 | (render-language ANFL #:nts '(A.e A.n A.v S.e e))) 118 | ] 119 | 120 | Next we define a multi-language semantics@~cite{matthews2007} |source-lang| + 121 | @|anf-lang|, which we name @|anf-multi-lang| 122 | We start by defining standard multi-language features, then present the unique 123 | changes for modeling compilation as a multi-language semantics. 124 | 125 | We define the standard multi-language syntax for @|anf-multi-lang| in 126 | @Figure-ref{fig:anf-multi-syn}. 127 | The syntax is defined essentially by merging the syntax of 128 | @|source-lang| and @|anf-lang|. 129 | First, we introduce tagged non-terminals: @render-term[ANFL S.e] for purely 130 | source and @render-term[ANFL A.e] for purely target terms. 131 | These are each extended with a boundary term: @render-term[ANFL (SA A.e)] for an 132 | embedding of a target term in a source term ("Source on the outside, ANF on 133 | the inside"), and @render-term[ANFL (AS S.e)] for embedding a target term in a 134 | source term ("ANF on the outside, Source on the inside"). 135 | Then we define the multi-language expressions @render-term[ANFL e] as either a 136 | source or target term. 137 | 138 | @figure["fig:anf-boundary-red" @elem{@|anf-multi-lang| Boundary Reductions} 139 | (render-reduction-relation st-> #:style 'horizontal) 140 | ] 141 | 142 | We also define the mostly standard boundary cancelation reductions in 143 | @Figure-ref{fig:anf-boundary-red}. 144 | We depart slightly by allowing reduction in arbitrary program contexts rather 145 | than evaluation context, reflecting the fact that compilation happens for any 146 | subterm rather than only those being evaluated. 147 | 148 | @section{Multi-language A-reductions} 149 | @figure["fig:a-red" @elem{The A-reductions} 150 | (render-reduction-relation anf-> #:style 'horizontal) 151 | ] 152 | 153 | Now we can define the A-reductions, given in @Figure-ref{fig:a-red}. 154 | The translation is defined over @render-term[ANFL S.e] expressions, that is, 155 | source expressions in the multi-language. 156 | The first rule reduces any term that happens to also be a target language 157 | expression and wraps it in an @render-term[ANFL SA] boundary, embedding the 158 | target expression properly. 159 | This step was implicit in by the reflexive closure of the A-reductions in the 160 | original presentation. 161 | The rest follow the same pattern as the original reductions. 162 | When a @render-term[ANFL let] expression appears in evaluation contexts, we 163 | merge the code across the declaration (implicitly renaming if necessary). 164 | Similarly with @render-term[ANFL letrec] and @render-term[ANFL begin]. 165 | We can also understand these as normalizing all commuting conversions. 166 | For @render-term[ANFL if], we merge code across the branches of the 167 | @render-term[ANFL if] expression. 168 | Unlike @~cite{flanagan1993}, we use the join-point optimization to prevent code 169 | duplication. 170 | When the @render-term[ANFL if] appears in non-tail position with respect to the 171 | evaluation context, we introduce a join point, a local explicit continuationm 172 | and call the join point in the branches. 173 | When the @render-term[ANFL if] appears in tail position, we need not push the 174 | evaluation context into the branches. 175 | In turns out that this can only happen when the evaluation context is trivial. 176 | Finally, when a (non-value) computation appears in evaluation position, and in a 177 | non-bind context, we name the intermediate computation explicitly pass the value 178 | to the evaluation context. 179 | 180 | In addition to perform the standard A-reductions, we mark each language boundary 181 | explicitly. After A-reducing a @render-term[ANFL let], for exameple, the 182 | @render-term[ANFL let] itself is in the target language, so we use the 183 | @render-term[ANFL SA] boundary to embed the now-target term it its source 184 | context. 185 | However, its subexpressions are still source expressions, so we embed them in 186 | the now-target @render-term[ANFL let] using the @render-term[ANFL AS] boundary. 187 | Note that this makes the subexpression appear in translation context, while the 188 | @render-term[ANFL SA] boundary means the @render-term[ANFL let] itself does not 189 | appear in translation context. 190 | 191 | @;Typically, we would next define reduction in the multi-language system. 192 | @;In the terms of @todo{citet matthews2007}, we would choose between a lump 193 | @;embedding or a natural embedding, to define the interoperability semantics of 194 | @;our multi-language system. 195 | 196 | In a typical multi-language semantics, we might allow the multi-language 197 | reductions to happen under either a source or target evaluation context. 198 | However, our goal is compilation rather than interoperability. 199 | We are not defining evaluation, but translation, so we design a more general 200 | program context, which we call the translation context. 201 | 202 | @;@figure["fig:anf-multi-trans-ctxt" @elem{@|anf-multi-lang| Translation Context} 203 | @; (parameterize ([extend-language-show-union #t] 204 | @; [extend-language-show-extended-order #t]) 205 | @; (render-language ANFL #:nts '(T))) 206 | @;] 207 | 208 | In @Figure-ref{fig:anf-trans-red}, we define a translation context 209 | @render-term[ANFL T]. 210 | This is a program context, a subset of general program context, under which a 211 | source term appears and should be translated into a target term. 212 | The definition is complicated slightly by the particulars of ANF, and is simpler 213 | in other translations. 214 | 215 | A translation context is any non-evaluation target languge context 216 | @render-term[ANFL A.Cm] appearing under a boundary @render-term[ANFL AS], 217 | written @render-term[ANFL (AS A.Cm)]. 218 | This can further be nested in an arbitrary multi-language program context, 219 | written @render-term[ANFL (in-hole C (AS A.Cm))]. 220 | The context @render-term[ANFL A.Cm] corresponds to any target language context in 221 | which a valid program can be constructed by plugging a configuration into the 222 | hole. 223 | @todo{Formally defined in hte appdendix.} 224 | @; @(render-language λaL #:nts '(Cm)) 225 | The inner context must be restricted to a non-evaluation context because the 226 | A-reduction manipulate the evaluation context, so the evaluation context must be 227 | part of the redex. 228 | The outer context is arbitrary, indicating that translation can proceed under 229 | any context 230 | 231 | @figure["fig:anf-trans-red" @elem{@|anf-multi-lang| Translation Reduction System} 232 | (hc-append 60 233 | (parameterize ([extend-language-show-union #t] 234 | [extend-language-show-extended-order #t]) 235 | (render-language ANFL #:nts '(T))) 236 | (with-paper-rewriters (render-judgment-form-rows anf->+j '(2)))) 237 | ] 238 | We define the translation reduction @(anf->+-arrow) in 239 | @Figure-ref{fig:anf-trans-red}, which combines either an A-reduction in 240 | translation context, or a boundary cancellation step. 241 | 242 | As an example, consider the reduction of following from @citet{flanagan1993}: 243 | @nested[#:style 'inset 244 | (render-prefix-and-finish ANFL anf->+ (anf->+-arrow) 6 (AS (+ (+ 2 2) (let ([x 1]) (f 1))))) 245 | ] 246 | 247 | We begin translation by embedding the source term in the multi-language in 248 | translation context, using the @render-term[ANFL AS] boundary. 249 | We contract the @render-term[ANFL let] redex, which adds the @render-term[ANFL 250 | SA] boundary around the whole expression, and wraps the 251 | subexpressions with the @render-term[ANFL AS] boundary. 252 | The next redex we contract is boundary cancellation. 253 | We proceed with another @render-term[ANFL let] redex, which merges the addition 254 | into the body of the declaration. 255 | Reduction continues until we reach ANF. 256 | 257 | @(require rackunit) 258 | @(check-exn values (lambda () (step 11 (term (AS (+ (+ 2 2) (let ([x 1]) (f 1)))))))) 259 | This example finishes in 3 steps in @~cite{flanagan1993}, but takes 10 steps in 260 | our multi-language presentation. 261 | This is because we make explicit the translation of source values into target 262 | values, and require extra boundary cancellation steps. 263 | In fact, we take a few short-cuts by allowing an arbitrary target term to reduce 264 | rather than reducing only target values, which would be more faithful for the 265 | multi-language interoperability semantics of @citet{matthews2007}, but is not 266 | really required for reduction as compiliation. 267 | 268 | @(define-syntax-rule (render-anf-eg e) 269 | (nested #:style 'code-inset 270 | (para "Example:") 271 | (tabular #:row-properties '((top)) (list (list "> " (render-term λaL (step (AS e)))))) 272 | (with-paper-rewriters (render-term/pretty-write λaL (term (compile-anf e)))))) 273 | 274 | We can define compilation as normalization with respect to the A-reductions and 275 | boundary reductions. 276 | 277 | @mdef["ANF Compilation by Normalization"] 278 | @render-judgment-form[anf-compile] 279 | 280 | @section{Compiler Correctness as Confluent Multi-Language Reduction} 281 | The multi-language semantics allows us to define a reduction system in which 282 | confluence is compiler correctness. 283 | The intuition is simple. 284 | We define a reduction system in which any embedded source term can either take a 285 | step in the source semantics, or take a translation step. 286 | Any target term can take a step in the target semantics. 287 | Confluence tells us that if we take a source step, then translate, then reduce, 288 | that's the same as translating then reducing, @ie confluence is forward 289 | simulation. 290 | 291 | @figure["fig:anf-multi-red" @elem{@|anf-multi-lang| Multi-language Reduction}]{ 292 | @(with-paper-rewriters (render-judgment-form-rows anf-eval->+'(2 2 1))) 293 | } 294 | First we define the multi-language reduction system (@Figure-ref{fig:anf-multi-red}}). 295 | The reduction system captures the the intution described above. 296 | If we have a source term, and it takes a step in the source semantics, then it 297 | takes a step in the multi-language reduction. 298 | We extend source reduction to apply under a @render-term[ANFL AS] boundary, 299 | essentially implementing the new @render-term[ANFL AS] evaluation context give 300 | the standard multi-language semantics. 301 | We analogous enable multi-language terms to reduce in the target semantics. 302 | And finally, we enable a term to take a step translation step, either performing 303 | an A-reduction or boundary cancellation step. 304 | This system defines the single-step relation, and we take its reflexive 305 | transitive closure to define the full multi-language evaluator. 306 | 307 | In the terms of @citet{matthews2007}, this reduction system is similar to a 308 | lump embedding. 309 | However, as our goal is compilation rather than interoperability, we give only a 310 | single directly to the natural embedding: source terms can be translated to 311 | target terms. 312 | Source term terms lump-embedded in the target are simply compiled before they 313 | can interoperate. 314 | Conversely, target terms embedded in the source remain target until the source 315 | is either finished running or compiling. 316 | 317 | Now we can define compiler correctness as confluence. 318 | 319 | @mthm[@elem{Confluence} #:tag "thm:anf:confluence"]{ 320 | If @render-term[ANFL (anf-eval->* (S e) (S_1 e_1))] and 321 | @render-term[ANFL (anf-eval->* (S e) (S_2 e_2))] then @exact{\\} 322 | @render-term[ANFL (anf-eval->* (S_1 e_1) (S_3 e_3))] and 323 | @render-term[ANFL (anf-eval->* (S_2 e_2) (S_3 e_3))] 324 | } 325 | 326 | @mcor[@elem{Whole-Program Correctness} #:tag "thm:anf:correct"]{ 327 | If 328 | @render-term[ANFL (λi->j* (() S.e) (S S.v))] and 329 | @render-term[ANFL (anf-compile S.e A.e)] then 330 | @render-term[ANFL (λa->j* (() A.e) (S A.v))] such that 331 | @render-term[ANFL A.v] is equal to 332 | @render-term[ANFL S.v]. 333 | } 334 | @tprf["Proof."]{ 335 | Note that @render-term[ANFL (λi->j* (() S.e) (S S.v))] implies 336 | @render-term[ANFL (anf-eval->* (() S.e) (S S.v))]. 337 | Similarly, @render-term[ANFL (anf-eval->* (S S.e) (S A.e))]. 338 | By confluence, there must exist some @render-term[ANFL S_3] and 339 | @render-term[ANFL e_3] such that 340 | @render-term[ANFL (anf-eval->* (S S.v) (S_3 e_3))] and 341 | @render-term[ANFL (anf-eval->* (S A.e) (S_3 e_3))]. 342 | Since values cannot step, we know @render-term[ANFL e_3] must be 343 | @render-term[ANFL S.v]. 344 | Since values are shared across languages, we pick @render-term[ANFL A.v] to be 345 | @render-term[ANFL S.v] and the goal is complete. 346 | } 347 | 348 | @;Unforunately, this may not save us much proof effort. 349 | @;The single-step reduction is not confluent, since a transation step may need to 350 | @;be followed by boundary cancelation before a target step can take place, so the 351 | @;proof of confluence of the evaluate is non-trivial. 352 | @;The simplest approach may be Takahashi's "universal common reduct", which 353 | @;essentially forces us to define the compiler as a translation. 354 | 355 | @section{Multi-Language Reduction as JIT Compilation} 356 | The multi-language evaluator captures the semantics of JIT compilation: at any 357 | point, a source expression can instead be translated to the target language, 358 | after which time it runs in the target language semantics. 359 | We could model speculative optimization as an @render-term[ANFL if] 360 | expression: the term @render-term[ANFL (SA (if e_p (AS e) e))], where 361 | @render-term[ANFL e_p] encodes the dynamic assumption under which the variant 362 | @render-term[ANFL (AS e)] is executed. 363 | Initially, this is just an embedded copy of the original unoptimized code. 364 | However, the multi-language evaluator semantics allow us to compile it. 365 | 366 | This is not a particularly realistic model of speculative execution. 367 | In particular, it does not capture fine-grained composable assumptions, nor 368 | deoptimization from models such as @~cite{flueckiger2018:jit}. 369 | However, the approach is more general: many compiler translations are easily 370 | modeled (as we show later) as multi-language semantics, and so we easily derive 371 | a model of source-to-assembly JIT compiler. 372 | Interesting future work would equip the multi-language semantics with better 373 | semantics for speculative optimization. 374 | 375 | @section{Type Preservation is Type Preservation} 376 | In type-preserving compilation, the goal is to preserve well typedness 377 | through compilation~@todo{cite TAL, TIL}. 378 | This requires designing a typed target language and translating types as well as 379 | terms. 380 | This provides a simple means of deciding whether linking is safe in the target 381 | language@todo{cite chlipala}, and provides some simple correctness conditions of the compiler and 382 | assists with debugging@todo{cite haskell report}. 383 | The main theorem is the @emph{type preservation theorem}, and is stated below. 384 | 385 | @mthm[@elem{Type Preservation (ala compiler correctness)} #:tag "type-pres"]{ 386 | If @render-term[λiL (λiL-types Γ e X)] and @render-term[ANFL (anf-compile e A.e)], 387 | @render-term[ANFL (anf-compile Γ A.Γ)], 388 | @render-term[ANFL (anf-compile X A.X)], 389 | then 390 | @render-term[λaL (λaL-types A.Γ A.e A.X)] 391 | } 392 | 393 | When proving syntactic type safety, the strategy is to break the proof into two 394 | lemmas: @emph{progress} and @emph{preservation}@todo{cite}. 395 | The latter lemma is the @emph{type preservation lemma}, which refers to proving 396 | that reduction preserves well typedness, and is also called @emph{subject 397 | reduction}, if one comes from a logic background. 398 | 399 | 400 | @mlem[@elem{Type Preservation (ala "progress and preservation")} #:tag "thm:subj-red"]{ 401 | If @render-term[λiL (λiL-types Γ e_1 A)] and @render-term[λiL (λa->j e_2 e_2)], 402 | then 403 | @render-term[λiL (λiL-types Γ e_2 A)]. 404 | } 405 | 406 | When the compiler is presented as a multi-language reduction system, the former 407 | is @emph{almost} a corollary of the latter. 408 | We need to generalize subject reduction, and slightly weaken derive type 409 | preservation. 410 | Subject reduction is generalized to the closure of the single step reduction, 411 | which is straightforward. 412 | Type preservation is weakened to forget the relationship of the target language 413 | types to the source language. 414 | 415 | @mthm[@elem{Type Preservation implies Type Preservation} #:tag "thm:type-pres-type-pres"]{ 416 | If (@render-term[ANFL (ANFL-types Γ e_1 A)] and @render-term[ANFL (anf->*j e_1 e_2)] 417 | implies @render-term[ANFL (ANF-types Γ e_2 A)]) then 418 | if (@render-term[λiL (λiL-types Γ e A)] and @render-term[ANFL (anf-compile e A.e)] then 419 | there exists @render-term[ANFL A.Γ] and @render-term[ANFL A.X] such that 420 | @render-term[λaL (λaL-types A.Γ A.e A.X)]). 421 | } 422 | @tprf[@elem{Proof.}]{ 423 | Since compilation, @(anf-compile-arrow), is defined as normalization with 424 | respect to the @(a->-arrow), the proof is simple. 425 | We instantiate the premise, subject reduction, with the derivation that 426 | @render-term[ANFL (anf->*j e A.e)]. 427 | This yield the fact that @render-term[ANFL (ANF-types Γ A.e X)]. 428 | Since @render-term[ANFL A.e] is purely target (@ie, has no boundary terms), and 429 | the multi-language type system allows source types only under a source-target 430 | boundary term, we know @render-term[ANFL Γ] and @render-term[ANFL X] must be purely target. 431 | This means the derivation @render-term[ANFL (ANF-types Γ A.e X)] is trivially a target derivation of 432 | @render-term[λaL (λaL-types A.Γ A.e A.X)], where @render-term[ANFL A.Γ] is 433 | picked to be @render-term[ANFL Γ] and @render-term[ANFL A.X] is picked to be 434 | @render-term[ANFL X]. 435 | } 436 | 437 | This slightly weaker type preservation theorem does leave room for one 438 | additional property that is typically guaranteed by a type preserving compiler: 439 | that there exists a (compositional) translation from source types to target 440 | types. 441 | This property is not specified as explicitly desirably in most of the 442 | type-preserving compilation literature, but does seem desirable. 443 | It is unclear if some analogous property of the multi-language semantics could 444 | guarantee this property. 445 | --------------------------------------------------------------------------------