├── .gitignore ├── LICENSE ├── README.md ├── poly.cabal ├── src ├── Elab.hs ├── Elab │ ├── Builtins.hs │ ├── Types.hs │ ├── Unify.hs │ └── Utils.hs ├── Main.hs ├── Poly.hs ├── Src.hs └── Utils.hs └── test.txt /.gitignore: -------------------------------------------------------------------------------- 1 | dist 2 | dist-newstyle 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2021 Mark Barbone 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining 4 | a copy of this software and associated documentation files (the 5 | "Software"), to deal in the Software without restriction, including 6 | without limitation the rights to use, copy, modify, merge, publish, 7 | distribute, sublicense, and/or sell copies of the Software, and to 8 | permit persons to whom the Software is furnished to do so, subject to 9 | the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be included 12 | in all copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 15 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 16 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 17 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 18 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 19 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 20 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # `poly`: a WIP compiler 2 | 3 | ## An example that works rn 4 | 5 | ```sml 6 | let 7 | fun id x = let val y = x in y end 8 | fun f (g : forall a. a -> a) = 3 + g 4 9 | in f id end 10 | ``` 11 | 12 | This parses, typechecks, and is lowered to the polymorphic IR. 13 | 14 | ## Language features 15 | 16 | **One-sentence summary:** an ML with higher-rank polymorphism, typeclasses, and 17 | SML-inspired syntax. 18 | 19 | **A more in-depth checklist:** 20 | - [X] Functions 21 | - [X] Polymorphism, let generalization, and the value restriction 22 | - [ ] Relaxed value restriction: track function arity and eta-expand 23 | point-free definitions if possible and needed 24 | - [X] Higher-rank (predicative) polymorphism and polymorphic subtyping 25 | - [X] Integers 26 | - [ ] Strings and other primitive types 27 | - [ ] ADTs and pattern matching (to do next) 28 | - [ ] Records 29 | - [ ] Haskell98-style typeclasses 30 | - [ ] Higher-kinded types 31 | - [ ] Modules, maybe 32 | 33 | Note: no polymorphic recursion, GADTs, or existentials, since they make 34 | monomorphisation impossible (or if not impossible, at least a lot harder). 35 | 36 | ## Compiler pipeline 37 | 38 | **One-sentence summary:** whole-program compilation with optional 39 | monomorphization and defunctionalization. 40 | 41 | **A more in-depth checklist:** 42 | 43 | **Parser** (in `Src`): 44 | - Works, mostly 45 | - Could use some tidying up 46 | - Produces a `Src.Exp` AST 47 | 48 | **Elaborator** (in `Elab` and `ElabUtils`): 49 | - Works, mostly 50 | - Consumes the `Src.Exp` AST and produces `Poly.Exp` IR 51 | - Internally, uses its own `ElabUtils.Ty` and translates it from/to `Src.Ty` 52 | and `Poly.Ty` 53 | 54 | **Polymorphic IR** (in `Poly`): 55 | - A typed ANF IR with System F-style explicit type application and abstraction 56 | - Type-preserving ANF optimizations on the IR 57 | - Nothing is implemented yet 58 | - [ ] Eta-contraction 59 | - [ ] Constant propagation, beta reduction 60 | - [ ] Inlining and DCE 61 | - [ ] Stretch goal: commuting conversions and join points 62 | - [ ] Stretch goal: user-defined rewrite rules 63 | 64 | **Monomorphic IR** (unimplemented): 65 | - A (mostly) typed monomorphic (ANF? SSA?) IR 66 | - Two ways to get there: 67 | - Type erasure: replace polymorphic type variables with the boxed `any` 68 | type, and use dictionary passing for typeclasses 69 | - Monomorphization: this will be fairly complex due to higher-rank 70 | polymorphism. However, by restricting first-class polymorphic values to 71 | have function type, it's doable alongside whole-program 72 | defunctionalization for first-class polymorphic functions. 73 | - More optimizations: 74 | - [ ] Contification 75 | - [ ] Inlining 76 | - [ ] DCE 77 | - [ ] Conditional constant propagation 78 | - [ ] Stretch goal: SROA, maybe others 79 | 80 | **Backend** (unimplemented): 81 | - To do 82 | - For a first pass I will probably emit C or use LLVM, but it'd be cool to 83 | implement native codegen 84 | - Design decision: need an efficient implementation of currying 85 | 86 | 87 | 88 | 89 | 90 | 91 | -------------------------------------------------------------------------------- /poly.cabal: -------------------------------------------------------------------------------- 1 | cabal-version: 2.4 2 | name: poly 3 | version: 0.1.0.0 4 | synopsis: A compiler supporting higher-rank polymorphism 5 | 6 | -- A longer description of the package. 7 | -- description: 8 | homepage: 9 | 10 | -- A URL where users can report bugs. 11 | -- bug-reports: 12 | license: MIT 13 | license-file: LICENSE 14 | author: Mark Barbone 15 | maintainer: mark.l.barbone@gmail.com 16 | 17 | -- A copyright notice. 18 | -- copyright: 19 | -- category: 20 | 21 | executable poly 22 | main-is: Main.hs 23 | 24 | -- Modules included in this executable, other than Main. 25 | other-modules: Src, Utils, Poly, Elab, Elab.Types, Elab.Utils, Elab.Builtins, Elab.Unify 26 | 27 | default-extensions: ImportQualifiedPost, BangPatterns, BlockArguments, 28 | TupleSections, DeriveFunctor, DeriveFoldable, 29 | DeriveTraversable, GeneralizedNewtypeDeriving, 30 | LambdaCase, RecordWildCards 31 | build-depends: base, containers, unordered-containers, text, mtl, writer-cps-mtl, parsec 32 | hs-source-dirs: src 33 | default-language: Haskell2010 34 | -------------------------------------------------------------------------------- /src/Elab.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE StrictData #-} 2 | {-# LANGUAGE OverloadedStrings #-} 3 | {-# LANGUAGE RecursiveDo #-} 4 | 5 | -- | Type inference and elaboration to the polymorphic core language 6 | -- 7 | -- This module ties everything together to implement the mutually recursive type 8 | -- inference functions 9 | module Elab where 10 | 11 | import Utils 12 | import Elab.Types 13 | import Elab.Utils 14 | import Elab.Unify 15 | import qualified Src 16 | import qualified Poly 17 | 18 | import Data.Text qualified as T 19 | import Data.HashMap.Strict qualified as HMap 20 | import Control.Applicative 21 | import Control.Monad 22 | import Control.Monad.Trans 23 | import Data.IORef 24 | import Data.IntMap.Strict (IntMap) 25 | import Data.IntMap.Strict qualified as IMap 26 | 27 | -- | The mutually recursive type inference functions 28 | 29 | check :: Ctx -> Src.Exp -> TyVal -> M Value 30 | -- check !ctx = uncurry $ traverse deref >=> \case 31 | check !ctx e ty = do 32 | ty' <- deref ty 33 | case (e, ty') of 34 | (_, VForall n a) | Src.isSyntacticValue e -> do 35 | -- Value restriction :/ 36 | let x = VVar (ctxLvl ctx) 37 | tlam n \arg -> check (addTyToCtx n arg ctx) e (a $$ x) 38 | (Src.ELam n Nothing body, VFun a b) -> lam ctx n a \x -> 39 | check (addVarToCtx n x a ctx) body b 40 | (Src.ELam n (Just srcTy) body, VFun a b) -> lam ctx n a \x -> do 41 | a' <- srcTyToTy ctx srcTy 42 | x' <- sub ctx x a a' 43 | check (addVarToCtx n x' a' ctx) body b 44 | (Src.ELet defn body, a) -> do 45 | ctx' <- define ctx defn 46 | check ctx' body a 47 | (Src.ECase _ _, a) -> error "TODO" 48 | (Src.EIf cond then_ else_, a) -> do 49 | c <- check ctx cond (error "TODO: add built-in types to context") 50 | t <- check ctx then_ a 51 | e <- check ctx else_ a 52 | pure (error "TODO") 53 | _ -> do 54 | (tm, a) <- infer ctx e 55 | sub ctx tm a ty 56 | 57 | infer :: Ctx -> Src.Exp -> M (Value, TyVal) 58 | infer !ctx e = case e of 59 | Src.ELit x -> do 60 | pure (lit x, error "TODO: add built-in types to context") 61 | Src.EVar n -> case HMap.lookup n (boundVars ctx) of 62 | Nothing -> typeError $ "variable " <> n <> " not in scope" 63 | Just (val,ty) -> pure (val,ty) 64 | Src.EAnnot e' srcTy -> do 65 | ty <- srcTyToTy ctx srcTy 66 | tm <- check ctx e' ty 67 | pure (tm, ty) 68 | Src.EApp f arg -> do 69 | (f', funcTy) <- infer ctx f 70 | apply ctx funcTy f' arg 71 | Src.ELam n srcTy body -> do 72 | a <- maybe (VHole <$> freshHole (ctxLvl ctx)) (srcTyToTy ctx) srcTy 73 | fmap (VFun a) <$> inferLam ctx n a \x -> 74 | infer (addVarToCtx n x a ctx) body 75 | Src.ELet defn body -> do 76 | ctx' <- define ctx defn 77 | infer ctx' body 78 | Src.ECase _ _ -> error "TODO" 79 | Src.EIf cond then_ else_ -> do 80 | c <- check ctx cond (error "TODO: add built-in types to context") 81 | (t, ty) <- infer ctx then_ 82 | e <- check ctx else_ ty 83 | pure (error "TODO", ty) 84 | 85 | -- | Helper function for the 'EApp' case in 'infer'. 86 | -- 87 | -- If 'f : fTy', 'apply ctx lvl fTy f x' returns '(f x, resultTy)' where 88 | -- 'f x : resultTy' 89 | apply :: Ctx -> TyVal -> Value -> Src.Exp -> M (Value, TyVal) 90 | apply !ctx fTy f x = deref fTy >>= \case 91 | VFun a b -> do 92 | x' <- check ctx x a 93 | fx <- app ctx b f x' 94 | pure (fx, b) 95 | VForall _ a -> do 96 | newHole <- VHole <$> freshHole (ctxLvl ctx) 97 | apply ctx (a $$ newHole) (tapp ctx f newHole) x 98 | VHole ref -> do 99 | Empty scope <- liftIO $ readIORef ref 100 | a <- VHole <$> freshHole scope 101 | b <- VHole <$> freshHole scope 102 | fill ref (Filled (VFun a b)) 103 | x' <- check ctx x a 104 | fx <- app ctx b f x' 105 | pure (fx, b) 106 | _ -> typeError "should be a function type" 107 | 108 | 109 | -- | Helper function for lets (for both check and infer directions) 110 | define :: Ctx -> Src.Defn -> M Ctx 111 | define !ctx (Src.Val n srcTy e) = do 112 | -- non-generalizing, non-recursive let 113 | ty <- srcTyToTy ctx srcTy 114 | x <- check ctx e ty 115 | x' <- letBind ctx n ty x 116 | pure (addVarToCtx n x' ty ctx) 117 | 118 | define !ctx (Src.Fun n srcTy e) = do 119 | -- generalizing letrec 120 | let ctx' = moveDownLevelCtx ctx 121 | ty <- srcTyToTy ctx' srcTy 122 | (ty', val) <- letRec ctx' n ty \val -> mdo 123 | -- fuck it, recursive do 124 | x <- check (addVarToCtx n this ty ctx') e ty 125 | (tids, ty') <- generalizeLet ctx ty 126 | let this = foldr (\tid v -> Poly.TApp v (pure (Poly.TVar tid))) val tids 127 | pure (ty', foldr (Poly.TLam "t") x tids) 128 | -- for debugging: 129 | liftIO . putStrLn . ((T.unpack n ++ " : ") ++) =<< displayTyCtx ctx ty' 130 | pure (addVarToCtx n val ty' ctx) 131 | 132 | define !ctx (Src.Datatype n constrs) = do 133 | error "TODO" 134 | 135 | 136 | -------------------------------------------------------------------------------- /src/Elab/Builtins.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE StrictData #-} 2 | {-# LANGUAGE OverloadedStrings #-} 3 | 4 | -- | Built-in things 5 | module Elab.Builtins where 6 | 7 | import Elab.Types 8 | 9 | -- I should add this to the context 10 | data Builtins = Builtins 11 | { tInt :: TyVal 12 | , tBool :: TyVal 13 | , tUnit :: TyVal 14 | } 15 | 16 | -- TODO: actually have built-in things in it 17 | initialCtx :: Ctx 18 | initialCtx = Ctx 19 | { ctxLvl = 0 20 | , typeTIds = mempty 21 | , typeNames = mempty 22 | , typeEnv = [] 23 | , boundVars = mempty } 24 | 25 | -------------------------------------------------------------------------------- /src/Elab/Types.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE StrictData #-} 2 | {-# LANGUAGE OverloadedStrings #-} 3 | 4 | -- | Datatypes used for type checking, and helper functions to convert to/from 5 | -- other datatypes. 6 | -- 7 | -- It also contains a couple generally useful helper functions 8 | module Elab.Types where 9 | 10 | import Utils 11 | import Poly qualified 12 | import Src qualified 13 | 14 | import Control.Monad.RWS.Strict 15 | import Control.Monad.State.Strict 16 | import Control.Exception (Exception, throw) 17 | import Data.Text (Text) 18 | import Data.Text qualified as T 19 | import Data.List 20 | import Data.IORef 21 | import Data.IntMap.Strict (IntMap) 22 | import Data.IntMap.Strict qualified as IMap 23 | import Data.HashMap.Strict (HashMap) 24 | import Data.HashMap.Strict qualified as HMap 25 | 26 | 27 | type Lvl = Int 28 | type Idx = Int 29 | 30 | data TyExp 31 | = TVar Idx 32 | | TFun TyExp TyExp 33 | | TForall Name TyExp 34 | | THole (IORef Hole) 35 | 36 | data TyVal 37 | = VVar Lvl 38 | | VFun TyVal TyVal 39 | | VForall Name {-# UNPACK #-} Closure 40 | | VHole (IORef Hole) 41 | 42 | data Closure = Closure [TyVal] TyExp 43 | 44 | -- | A hole in a type. 45 | data Hole 46 | = Filled TyVal 47 | | Empty Lvl 48 | | Generalized Lvl Poly.TId 49 | -- ^ The hole was filled during let-generalization. 50 | -- During let-generalization, holes get filled with 'Generalized Lvl TId' 51 | 52 | -- An invariant about the scope of holes: in '∀ x. ... [H] ...', the scope of 53 | -- the hole '[H]' never includes the type variable 'x'. 54 | 55 | 56 | -- | Read-only state used during elaboration 57 | data Ctx = Ctx 58 | { ctxLvl :: Lvl 59 | , typeTIds :: IntMap Poly.Ty 60 | , typeNames :: ~(IntMap Name) 61 | , typeEnv :: [TyVal] 62 | , boundVars :: HashMap Name (Value,TyVal) 63 | -- , boundTypes :: HashMap Name Lvl 64 | } 65 | 66 | addTyToCtx :: Name -> Poly.Ty -> Ctx -> Ctx 67 | addTyToCtx n ty Ctx{..} = Ctx 68 | { ctxLvl = ctxLvl + 1 69 | , typeTIds = IMap.insert ctxLvl ty typeTIds 70 | , typeNames = IMap.insert ctxLvl n typeNames 71 | , typeEnv = VVar ctxLvl : typeEnv 72 | , boundVars = boundVars } 73 | 74 | -- | Like addTyToCtx, but without actually adding a type. 75 | -- 76 | -- It's used in let-generalization, to identify holes local to the let binding. 77 | moveDownLevelCtx :: Ctx -> Ctx 78 | moveDownLevelCtx ctx = 79 | ctx { ctxLvl = ctxLvl ctx + 1, typeEnv = error "no type here" : typeEnv ctx } 80 | 81 | addVarToCtx :: Name -> Value -> TyVal -> Ctx -> Ctx 82 | addVarToCtx n val ty ctx = 83 | ctx { boundVars = HMap.insert n (val,ty) $ boundVars ctx } 84 | 85 | -- TODO: nicer type errors 86 | newtype TypeError = TypeError Text deriving (Show) 87 | instance Exception TypeError 88 | 89 | -- | Raise a type error 90 | typeError :: Text -> M a 91 | typeError = liftIO . throw . TypeError 92 | 93 | -- | General strategy: we first build up a `Exp' GonnaBeATy`, where each 94 | -- 'GonnaBeATy' action converts 'TyVal's to 'Poly.Ty's (using 'resolveTy'). 95 | -- 96 | -- Then at the end we use 'sequence' to run all the 'GonnaBeATy's and get an 97 | -- `Exp' Poly.Ty`. 98 | -- 99 | -- We need to do that conversion at the end bc during elaboration the types 100 | -- still have holes in them. 101 | type GonnaBeATy = StateT Int IO Poly.Ty 102 | -- ^^^^^^^^^^ for generating fresh type ids 103 | 104 | type Value = Poly.Value' GonnaBeATy 105 | 106 | -- | The elaboration monad. 107 | -- 108 | -- read-write state for fresh ids 109 | -- vvv 110 | type M = RWST () (Endo (Poly.Exp' GonnaBeATy)) Int IO 111 | -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 112 | -- write-only state for building up 'let's in the IR 113 | 114 | runM :: M (a, Poly.Exp' GonnaBeATy) -> IO (a, Poly.Exp) 115 | runM ma = do 116 | ((a, e), s, lets) <- runRWST ma () 0 117 | (a,) <$> evalStateT (sequence $ appEndo lets e) s 118 | 119 | freshHole :: Lvl -> M (IORef Hole) 120 | freshHole l = liftIO $ newIORef (Empty l) 121 | 122 | freshId :: M Poly.Id 123 | freshId = state \i -> (Poly.Id i, i + 1) 124 | 125 | freshTId :: M Poly.TId 126 | freshTId = state \i -> (Poly.TId i, i + 1) 127 | 128 | 129 | -- | Can't spell NbE without 'evalTy' 130 | evalTy :: [TyVal] -> TyExp -> TyVal 131 | evalTy env (TVar idx) = env !! idx 132 | evalTy env (TFun a b) = VFun (evalTy env a) (evalTy env b) 133 | evalTy env (TForall n ty) = VForall n (Closure env ty) 134 | evalTy _ (THole h) = VHole h 135 | 136 | -- | Apply a closure to an argument 137 | infixr 0 $$ 138 | ($$) :: Closure -> TyVal -> TyVal 139 | Closure env body $$ arg = evalTy (arg : env) body 140 | 141 | 142 | -- | Convert a source-code 'Src.Ty' to a typechecker 'TyVal' 143 | srcTyToTy :: Ctx -> Src.Ty -> M TyVal 144 | srcTyToTy ctx t = evalTy [] <$> go [] t 145 | where 146 | -- TODO: look up types in the type context 147 | lvl = ctxLvl ctx 148 | go env ty = case ty of 149 | Src.THole -> THole <$> freshHole lvl 150 | Src.TVar name 151 | | Just idx <- name `elemIndex` env -> pure (TVar idx) 152 | | otherwise -> typeError $ "type " <> name <> " not in scope" 153 | Src.TFun a b -> TFun <$> go env a <*> go env b 154 | Src.TForall name a -> TForall name <$> go (name:env) a 155 | _ -> error "TODO" 156 | 157 | 158 | -- | Pretty-print a type. 159 | displayTyCtx :: Ctx -> TyVal -> M String 160 | displayTyCtx ctx = displayTy (ctxLvl ctx) (typeNames ctx) 161 | 162 | -- | Pretty-print a type. 163 | displayTy :: Lvl -> IntMap Name -> TyVal -> M String 164 | displayTy = go False 165 | where 166 | parens p s = if p then "(" ++ s ++ ")" else s 167 | go _ !_ !tyNames (VVar l) = 168 | pure $ T.unpack (tyNames IMap.! l) ++ showSubscript l 169 | go p !lvl !tyNames (VFun a b) = do 170 | x <- go True lvl tyNames a 171 | y <- go False lvl tyNames b 172 | pure $ parens p $ x ++ " -> " ++ y 173 | go p !lvl !tyNames (VForall n a) = do 174 | x <- go False (lvl+1) (IMap.insert lvl n tyNames) (a $$ VVar lvl) 175 | pure $ parens p $ "forall " ++ T.unpack n ++ showSubscript lvl ++ ". " ++ x 176 | go p !lvl !tyNames (VHole ref) = liftIO (readIORef ref) >>= \case 177 | -- TODO: should empty holes have names? I think they should. 178 | Empty l -> pure $ "?[at level " ++ show l ++ "]" 179 | Filled ty -> go p lvl tyNames ty 180 | Generalized _ _ -> error "internal error" 181 | 182 | 183 | -- | Convert a typechecker 'TyVal' to an IR 'Poly.Ty'. 184 | resolveTy :: Ctx -> TyVal -> StateT Int IO Poly.Ty 185 | resolveTy ctx@Ctx{..} ty = case ty of 186 | VVar lvl -> pure (typeTIds IMap.! lvl) 187 | VFun a b -> Poly.TFun <$> resolveTy ctx a <*> resolveTy ctx b 188 | VForall n a -> do 189 | tid <- state \i -> (Poly.TId i, i + 1) 190 | let ctx' = addTyToCtx n (Poly.TVar tid) ctx 191 | Poly.TForall n tid <$> resolveTy ctx' (a $$ VVar ctxLvl) 192 | VHole ref -> liftIO (readIORef ref) >>= \case 193 | Empty _ -> pure Poly.TUnit -- error "ambiguous type" -- TODO better error 194 | Filled a -> resolveTy ctx a 195 | Generalized _ tid -> pure (Poly.TVar tid) 196 | 197 | 198 | 199 | -------------------------------------------------------------------------------- /src/Elab/Unify.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE StrictData #-} 2 | {-# LANGUAGE OverloadedStrings #-} 3 | 4 | -- | Unification and polymorphic subtyping 5 | module Elab.Unify where 6 | 7 | import Utils 8 | import Elab.Types 9 | import Elab.Utils 10 | 11 | import Data.IntMap.Strict (IntMap) 12 | import Data.IntMap.Strict qualified as IMap 13 | import Data.IORef 14 | import qualified Data.Text as T 15 | import Control.Applicative 16 | import Control.Monad 17 | import Control.Monad.IO.Class 18 | 19 | -- | only need some of the context during unification 20 | data UnifyCtx = UnifyCtx { uctxLvl :: Lvl, uctxTypeNames :: ~(IntMap Name) } 21 | 22 | addTyToUnifyCtx :: Name -> UnifyCtx -> UnifyCtx 23 | addTyToUnifyCtx n UnifyCtx{..} = UnifyCtx 24 | { uctxLvl = uctxLvl + 1 25 | , uctxTypeNames = IMap.insert uctxLvl n uctxTypeNames } 26 | 27 | ctxToUnifyCtx :: Ctx -> UnifyCtx 28 | ctxToUnifyCtx Ctx{..} = UnifyCtx ctxLvl typeNames 29 | 30 | -- | Occurs check and scope check 31 | unifyHolePrechecks :: UnifyCtx -> IORef Hole -> Lvl -> TyVal -> M () 32 | unifyHolePrechecks !ctx ref scope = go ctx 33 | where 34 | initialLvl = uctxLvl ctx 35 | go !ctx = deref >=> \case 36 | VVar lvl -> when (lvl >= scope && lvl < initialLvl) $ do 37 | let name = uctxTypeNames ctx IMap.! lvl 38 | typeError $ "type variable " <> name <> " escaping its scope" 39 | VFun a b -> go ctx a >> go ctx b 40 | VForall n a -> go (addTyToUnifyCtx n ctx) (a $$ VVar (uctxLvl ctx)) 41 | VHole h | h == ref -> typeError "occurs check: can't make infinite type" 42 | VHole h -> do 43 | Empty l <- liftIO $ readIORef h 44 | when (l > scope) $ fill h (Empty scope) 45 | 46 | -- | Unify two monotypes 47 | -- 48 | -- 'unify a b' fills in the holes in 'a' and 'b' so that 'a = b' 49 | unify :: UnifyCtx -> TyVal -> TyVal -> M () 50 | unify !ctx t1 t2 = liftA2 (,) (deref t1) (deref t2) >>= \case 51 | (VHole refX, y) -> unifyHoleTy ctx refX y 52 | (x, VHole refY) -> unifyHoleTy ctx refY x 53 | (VVar x, VVar y) | x == y -> pure () 54 | (VFun a b, VFun a' b') -> unify ctx a a' >> unify ctx b b' 55 | (VForall name a, VForall _ b) -> do 56 | let x = VVar (uctxLvl ctx) 57 | unify (addTyToUnifyCtx name ctx) (a $$ x) (b $$ x) 58 | (x, y) -> do 59 | let disp = displayTy (uctxLvl ctx) (uctxTypeNames ctx) 60 | x' <- disp x 61 | y' <- disp y 62 | typeError $ "mismatch between " <> T.pack x' <> " and " <> T.pack y' 63 | 64 | -- | unifyHoleTy ref ty: fill in ref to satisfy 'ref = ty' 65 | unifyHoleTy :: UnifyCtx -> IORef Hole -> TyVal -> M () 66 | unifyHoleTy !ctx refX = deref >=> \case 67 | VHole refY | refX == refY -> pure () 68 | ty -> do 69 | Empty lvl <- liftIO $ readIORef refX 70 | unifyHolePrechecks ctx refX lvl ty 71 | fill refX (Filled ty) 72 | 73 | -- | Fancy subsumption for potentially polymorphic types 74 | -- 75 | -- 'sub ctx tm a b' fills in the holes in 'a' and 'b' so that 'a <: b' 76 | -- 77 | -- It also takes a term 'tm : a' and returns a term with type 'b' 78 | sub :: Ctx -> Value -> TyVal -> TyVal -> M Value 79 | sub !ctx tm t1 t2 = liftA2 (,) (deref t1) (deref t2) >>= \case 80 | (VHole a, b) -> do 81 | Empty scope <- liftIO $ readIORef a 82 | a' <- freshHole scope 83 | tm' <- subHoleTy ctx a tm a' b 84 | contents <- liftIO $ readIORef a' 85 | fill a contents 86 | pure tm' 87 | (a, VHole b) -> do 88 | Empty scope <- liftIO $ readIORef b 89 | b' <- freshHole scope 90 | tm' <- subTyHole ctx b tm a b' 91 | contents <- liftIO $ readIORef b' 92 | fill b contents 93 | pure tm' 94 | (a, VForall n b) -> do 95 | let x = VVar (ctxLvl ctx) 96 | tlam n \arg -> sub (addTyToCtx n arg ctx) tm a (b $$ x) 97 | (VForall _ a, b) -> do 98 | newHole <- VHole <$> freshHole (ctxLvl ctx) 99 | sub ctx (tapp ctx tm newHole) (a $$ newHole) b 100 | (VFun a a', VFun b b') -> lam ctx "eta" b \arg -> do 101 | arg' <- sub ctx arg b a 102 | tm' <- app ctx a' tm arg' 103 | sub ctx tm' a' b' 104 | (a, b) -> unify (ctxToUnifyCtx ctx) a b >> pure tm 105 | 106 | -- | 'subHoleTy ctx ref tm hole ty': fill in hole so that 'hole <: ty' 107 | -- 108 | -- Also takes a term 'tm : hole' and coerces it to type 'ty' 109 | subHoleTy :: Ctx -> IORef Hole -> Value -> IORef Hole -> TyVal -> M Value 110 | subHoleTy !ctx ref tm hole ty = deref ty >>= \case 111 | VForall n b -> do 112 | let x = VVar (ctxLvl ctx) 113 | tlam n \arg -> subHoleTy (addTyToCtx n arg ctx) ref tm hole (b $$ x) 114 | VFun b b' -> do 115 | Empty scope <- liftIO $ readIORef hole 116 | a <- freshHole scope 117 | a' <- freshHole scope 118 | fill hole $ Filled (VFun (VHole a) (VHole a')) 119 | lam ctx "eta" b \arg -> do 120 | arg' <- subTyHole ctx ref arg b a 121 | tm' <- app ctx (VHole a') tm arg' 122 | subHoleTy ctx ref tm' a' b' 123 | b -> do 124 | Empty scope <- liftIO $ readIORef ref 125 | unifyHolePrechecks (ctxToUnifyCtx ctx) ref scope b 126 | fill hole (Filled b) 127 | pure tm 128 | 129 | -- | 'subTyHole ctx ref tm ty hole': fill in hole so that 'ty <: hole' 130 | -- 131 | -- Also takes a term 'tm : ty' and coerces it to type 'hole' 132 | subTyHole :: Ctx -> IORef Hole -> Value -> TyVal -> IORef Hole -> M Value 133 | subTyHole !ctx ref tm ty hole = deref ty >>= \case 134 | VForall _ a -> do 135 | newHole <- VHole <$> freshHole (ctxLvl ctx) 136 | subTyHole ctx ref (tapp ctx tm newHole) (a $$ newHole) hole 137 | VFun a a' -> do 138 | Empty scope <- liftIO $ readIORef hole 139 | b <- freshHole scope 140 | b' <- freshHole scope 141 | fill hole $ Filled (VFun (VHole b) (VHole b')) 142 | lam ctx "eta" (VHole b) \arg -> do 143 | arg' <- subHoleTy ctx ref arg b a 144 | tm' <- app ctx a' tm arg' 145 | subTyHole ctx ref tm' a' b' 146 | a -> do 147 | Empty scope <- liftIO $ readIORef ref 148 | unifyHolePrechecks (ctxToUnifyCtx ctx) ref scope a 149 | fill hole (Filled a) 150 | pure tm 151 | 152 | 153 | 154 | -------------------------------------------------------------------------------- /src/Elab/Utils.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE StrictData #-} 2 | {-# LANGUAGE OverloadedStrings #-} 3 | {-# LANGUAGE RecursiveDo #-} 4 | 5 | -- | Utils used by the elaborator 6 | -- 7 | -- * Smart constructors for the 'Poly' IR, including getting the type of a 8 | -- built-in 9 | -- * Some convenient helper functions for dealing with typechecker 'TyVal's 10 | module Elab.Utils where 11 | 12 | import Poly hiding (Value, Ty(..)) 13 | import Poly qualified 14 | import Utils 15 | import Elab.Types 16 | 17 | import Control.Monad.State.Strict 18 | import Control.Monad.Writer.CPS 19 | import Data.IORef 20 | -- import Debug.Trace 21 | 22 | -- some local helper functions 23 | runLocally :: (a -> Exp' GonnaBeATy) -> M a -> M (a, Exp' GonnaBeATy) 24 | runLocally k ma = pass $ do 25 | (a, lets) <- listen ma 26 | pure ((a, appEndo lets (k a)), const mempty) 27 | 28 | letBindComp :: Ctx -> Name -> TyVal -> Comp' GonnaBeATy -> M Value 29 | letBindComp ctx n ty e = do 30 | ident <- freshId 31 | tell $ Endo $ Let n ident (resolveTy ctx ty) e 32 | pure (Var ident) 33 | 34 | 35 | -- smart constructors for IR values 36 | var :: Id -> Value 37 | var = Var 38 | 39 | lit :: Int -> Value 40 | lit = Lit 41 | 42 | -- Probably not :/ 43 | letBind :: Ctx -> Name -> TyVal -> Value -> M Value 44 | letBind ctx n ty val = letBindComp ctx n ty (Val val) 45 | 46 | letRec :: Ctx -> Name -> TyVal -> (Value -> M (a, Value)) -> M (a, Value) 47 | letRec ctx n ty f = do 48 | ident <- freshId 49 | ((a, _), e) <- runLocally (Comp . Val . snd) $ f (Var ident) 50 | -- lets should be mempty 51 | case e of 52 | Comp (Val val) -> do 53 | tell $ Endo $ LetRec n ident (resolveTy ctx ty) val 54 | pure (a, Var ident) 55 | _ -> error "internal error: should only letrec a value" 56 | 57 | app :: Ctx -> TyVal -> Value -> Value -> M Value 58 | app ctx ty f x = letBindComp ctx "tmp" ty (App f x) 59 | 60 | tapp :: Ctx -> Value -> TyVal -> Value 61 | tapp ctx x a = TApp x (resolveTy ctx a) 62 | 63 | -- | Like 'lam' but with a more convenient type signature for the 'infer' function 64 | inferLam :: Ctx -> Name -> TyVal -> (Value -> M (Value, TyVal)) -> M (Value, TyVal) 65 | inferLam ctx n a f = do 66 | ident <- freshId 67 | ((_, b), body) <- runLocally (Comp . Val . fst) $ f (Var ident) 68 | pure (Lam n ident (resolveTy ctx a) body, b) 69 | 70 | lam :: Ctx -> Name -> TyVal -> (Value -> M Value) -> M Value 71 | lam ctx n a f = do 72 | ident <- freshId 73 | (_, body) <- runLocally (Comp . Val) $ f (Var ident) 74 | pure (Lam n ident (resolveTy ctx a) body) 75 | -- fst <$> inferLam n a \arg -> (,b) <$> f arg 76 | 77 | tlam :: Name -> (Poly.Ty -> M Value) -> M Value 78 | tlam n tm = do 79 | tid <- freshTId 80 | (_, body) <- runLocally (Comp . Val) $ tm (Poly.TVar tid) 81 | case body of 82 | Comp (Val v) -> pure (TLam n tid v) 83 | _ -> typeError "Value restriction: not a value!" 84 | 85 | -- | Dereference if it's a hole. 86 | -- 87 | -- If it returns a 'VHole ref', then 'ref' is guaranteed to be empty 88 | deref :: TyVal -> M TyVal 89 | deref (VHole ref) = liftIO $ go ref 90 | where 91 | go r = readIORef r >>= \case 92 | Filled (VHole ref') -> do 93 | contents <- go ref' 94 | writeIORef r (Filled contents) 95 | pure contents 96 | Filled contents -> pure contents 97 | _ -> pure $ VHole r 98 | deref x = pure x 99 | 100 | -- | Fill an empty hole 101 | fill :: IORef Hole -> Hole -> M () 102 | fill ref contents = liftIO $ modifyIORef' ref \case 103 | Empty _ -> contents 104 | _ -> error "internal error: can only fill empty holes" 105 | 106 | -- | Generalize a let-binding 107 | -- 108 | -- Returns a tuple of: 109 | -- - TIds the holes were filled with 110 | -- - generalized type 111 | generalizeLet :: Ctx -> TyVal -> M ([Poly.TId], TyVal) 112 | generalizeLet ctx ty = mdo 113 | let initialLvl = ctxLvl ctx 114 | let base = initialLvl + 1 -- the type lives a level higher 115 | -- go finds holes and also shifts levels and also quotes TyVal to TyExp 116 | -- important: it returns its result lazily... 117 | let go :: Lvl -> TyVal -> StateT Lvl (WriterT [Poly.TId] M) TyExp 118 | go lvl = lift . lift . deref >=> \case 119 | VVar l -> pure $ case compare l base of 120 | LT -> TVar (lvl - l - 1) 121 | EQ -> error "internal error: should not have type vars of this level" 122 | GT -> TVar (lvl - (l + newLvl - base) - 1) 123 | VFun a b -> TFun <$> go lvl a <*> go lvl b 124 | VForall n a -> TForall n <$> go (lvl + 1) (a $$ VVar lvl) 125 | VHole hole -> liftIO (readIORef hole) >>= \case 126 | Empty s | s > initialLvl -> do 127 | -- add another TLam and another TApp 128 | tid <- lift $ lift freshTId 129 | newBinderLvl <- get 130 | modify' (+1) 131 | tell [tid] 132 | -- set the hole to Generalized 133 | lift $ lift $ fill hole (Generalized newBinderLvl tid) 134 | pure (TVar (lvl - newBinderLvl - 1)) 135 | Empty _ -> pure (THole hole) 136 | Generalized l _ -> pure (TVar (lvl - l - 1)) 137 | Filled _ -> error "oh no" 138 | -- ... because it uses the newLvl it returns 139 | ((tyExp, newLvl), tids) <- runWriterT $ go newLvl ty `runStateT` ctxLvl ctx 140 | let tyExp' = iter (newLvl - ctxLvl ctx) (TForall "t") tyExp 141 | tyVal = evalTy (typeEnv ctx) tyExp' 142 | pure (tids, tyVal) 143 | 144 | 145 | -------------------------------------------------------------------------------- /src/Main.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE OverloadedStrings #-} 2 | 3 | module Main where 4 | 5 | import Src (exprParser) 6 | import Text.Parsec (parse, eof) 7 | import Elab qualified 8 | import Elab.Types qualified as Elab 9 | import Elab.Builtins qualified as Elab 10 | import Poly 11 | 12 | main :: IO () 13 | main = do 14 | contents <- getContents 15 | case parse (exprParser <* eof) "stdin" contents of 16 | Left err -> print err 17 | Right e -> do 18 | print e 19 | let ctx = Elab.initialCtx 20 | (ty, ir) <- Elab.runM $ do 21 | (val, ty) <- Elab.infer ctx e 22 | (, Comp (Val val)) <$> Elab.displayTyCtx ctx ty 23 | putStrLn $ "result : " ++ ty 24 | -- TODO: pretty-printers 25 | -- print ir 26 | 27 | -------------------------------------------------------------------------------- /src/Poly.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE StrictData #-} 2 | 3 | 4 | -- | The polymorphic core language. 5 | -- 6 | -- Roughly, ANF'd System F with no impredicative polymorphism 7 | module Poly where 8 | 9 | import Data.Text qualified as T 10 | import Data.IntMap.Strict qualified as IMap 11 | import Utils 12 | 13 | -- Types 14 | ------------------------------------------------------------ 15 | 16 | newtype TId = TId Int deriving (Show, Eq, Ord) 17 | 18 | data Ty = TVar TId 19 | | TFun Ty Ty 20 | | TForall Name TId Ty 21 | -- TODO: these can all eventually be coalesced into a single variant for 22 | -- rigid type constructors 23 | | TUnit 24 | | TInt 25 | | TPair Ty Ty 26 | deriving (Show, Eq) 27 | 28 | 29 | -- Terms 30 | ------------------------------------------------------------ 31 | 32 | newtype Id = Id Int deriving (Show, Eq, Ord) 33 | 34 | -- | Values are characterized by no runtime behavior (except possibly 35 | -- allocation) 36 | type Value = Value' Ty 37 | data Value' ty 38 | = Var Id 39 | | Lit Int 40 | -- | Pair (Value' ty) (Value' ty) 41 | | Lam Name Id ty (Exp' ty) 42 | | TLam Name TId (Value' ty) -- Value restriction! 43 | | TApp (Value' ty) ty -- hmmm 44 | deriving (Show, Eq, Functor, Foldable, Traversable) 45 | 46 | -- | Computations 47 | type Comp = Comp' Ty 48 | data Comp' ty 49 | = Val (Value' ty) 50 | | App (Value' ty) (Value' ty) 51 | -- TODO 52 | -- | Builtin Builtin [Value] 53 | -- | If (Value' ty) (Exp' ty) (Exp' ty) -- TODO 54 | deriving (Show, Eq, Functor, Foldable, Traversable) 55 | 56 | -- | Expressions 57 | type Exp = Exp' Ty 58 | data Exp' ty 59 | = Let Name Id ty (Comp' ty) (Exp' ty) 60 | -- only recursively define values. 61 | | LetRec Name Id ty (Value' ty) (Exp' ty) 62 | | Comp (Comp' ty) 63 | deriving (Show, Eq, Functor, Foldable, Traversable) 64 | 65 | 66 | -- TODO: pretty-printing 67 | 68 | prettyTy :: Ty -> String 69 | prettyTy = go IMap.empty False 70 | where 71 | parens p s = if p then "(" ++ s ++ ")" else s 72 | go ctx _ (TVar (TId i)) = IMap.findWithDefault "UNNAMED" i ctx ++ showSubscript i 73 | go ctx p (TFun a b) = parens p $ go ctx True a ++ " -> " ++ go ctx False b 74 | go ctx p (TForall n (TId i) ty) = parens p $ "forall " ++ T.unpack n ++ showSubscript i ++ ". " ++ go (IMap.insert i (T.unpack n) ctx) False ty 75 | go ctx _ (TPair a b) = "(" ++ go ctx False a ++ ", " ++ go ctx False b ++ ")" 76 | go _ _ TUnit = "unit" 77 | go _ _ TInt = "int" 78 | 79 | 80 | -------------------------------------------------------------------------------- /src/Src.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE StrictData #-} 2 | {-# LANGUAGE OverloadedStrings #-} 3 | 4 | -- | The source AST and parser. Meant to be imported qualified 5 | module Src where 6 | 7 | import Utils 8 | 9 | import Control.Applicative 10 | import Control.Monad 11 | import Data.Text (Text) 12 | import Data.Text qualified as T 13 | import Data.Functor 14 | import Text.Parsec (Parsec, parse, eof, try, optionMaybe, option, (), oneOf) 15 | import Text.Parsec.Char (letter, alphaNum, char, string) 16 | import Text.Parsec.Token 17 | import Text.Parsec.Expr 18 | import Data.Maybe 19 | 20 | data Ty 21 | = TVar Name 22 | | THole -- _ in a type 23 | | TPair Ty Ty 24 | | TFun Ty Ty 25 | | TForall Name Ty 26 | deriving (Eq, Show) 27 | 28 | data Pat 29 | = PWildcard 30 | | PVar Name 31 | | PConstr Name [Pat] 32 | deriving (Eq, Show) 33 | 34 | data Exp 35 | = EVar Name 36 | | ELit Int 37 | | EApp Exp Exp 38 | | ELam Name (Maybe Ty) Exp 39 | | EAnnot Exp Ty 40 | | ECase Exp [(Pat, Exp)] 41 | | EIf Exp Exp Exp 42 | | ELet Defn Exp 43 | deriving (Eq, Show) 44 | 45 | data Defn 46 | -- fun is generalizing letrec, val is non-generalizing non-recursive let 47 | = Val Name Ty Exp 48 | | Fun Name Ty Exp 49 | -- Local datatypes! 50 | | Datatype Name [(Name,[Ty])] 51 | deriving (Eq, Show) 52 | 53 | isSyntacticValue :: Exp -> Bool 54 | isSyntacticValue ELam{} = True 55 | isSyntacticValue (EAnnot e _) = isSyntacticValue e 56 | isSyntacticValue _ = False 57 | 58 | -- From here on, it's just parsing/pretty-printing 59 | 60 | -- TODO: pretty-printing 61 | 62 | -- parser :: Parsec String () [Top] 63 | exprParser :: Parsec String () Exp 64 | exprParser = whiteSpace *> expr 65 | where 66 | TokenParser{..} = makeTokenParser LanguageDef{ 67 | commentStart = "(*" 68 | , commentEnd = "*)" 69 | , commentLine = "--" 70 | , nestedComments = True 71 | , identStart = letter <|> char '_' 72 | , identLetter = alphaNum <|> char '_' 73 | , opStart = oneOf "!#$%&*+./<=>?@\\^|-~:" 74 | , opLetter = oneOf "!#$%&*+./<=>?@\\^|-~:" 75 | , reservedNames = 76 | ["forall", "_", "if", "then", "else", "val", "let", "in", "end", "fun", "fn"] 77 | , reservedOpNames = ["->", "=>", "=", ":", "*", "+", "-", ".", "|"] 78 | , caseSensitive = True 79 | } 80 | 81 | ident = T.pack <$> identifier 82 | 83 | letExpr = do 84 | reserved "let" 85 | ds <- many defn 86 | reserved "in" 87 | exp <- expr 88 | reserved "end" 89 | pure $ foldr ELet exp ds 90 | 91 | atomic = EVar <$> ident 92 | <|> ELit . fromInteger <$> natural 93 | <|> try (parens (pure $ EVar "()")) 94 | <|> parens expr 95 | <|> letExpr 96 | "simple expression" 97 | 98 | factor = foldl1 EApp <$> some atomic 99 | 100 | -- TODO: make if, fn, case prefix operators 101 | arithExpr = buildExpressionParser table factor 102 | "arithmetic expression" 103 | table = [[op "*"], [op "+", op "-"]] 104 | op o = Infix (reservedOp o $> \x y -> EApp (EApp (EVar (T.pack o)) x) y) AssocLeft 105 | 106 | annotExpr = do 107 | e <- arithExpr 108 | maybe e (EAnnot e) <$> optionMaybe (reservedOp ":" *> typ) 109 | ifExpr = do 110 | reserved "if" 111 | cond <- expr 112 | reserved "then" 113 | trueBranch <- expr 114 | reserved "else" 115 | falseBranch <- expr 116 | pure $ EIf cond trueBranch falseBranch 117 | lambda = do 118 | reserved "fn" 119 | args <- some param 120 | reservedOp "=>" 121 | body <- expr 122 | pure $ foldr (uncurry ELam) body args 123 | expr = ifExpr <|> lambda <|> annotExpr "expression" 124 | 125 | param :: Parsec String () (Name, Maybe Ty) 126 | param = (, Nothing) <$> ident 127 | <|> parens ((,) <$> ident <* reservedOp ":" <*> (Just <$> typ)) 128 | "parameter" 129 | 130 | simpleTyp = reserved "_" $> THole 131 | <|> TVar <$> ident 132 | <|> parens typ 133 | typ :: Parsec String () Ty 134 | typ = liftA2 maybe id TFun <$> simpleTyp <*> optionMaybe (reservedOp "->" *> typ) 135 | <|> flip (foldr TForall) <$> (reserved "forall" *> many ident) <*> (reservedOp "." *> typ) 136 | "type" 137 | 138 | defn = val <|> fun "definition" 139 | val = do 140 | reserved "val" 141 | n <- ident 142 | t <- option THole (reservedOp ":" *> typ) 143 | reservedOp "=" 144 | v <- expr 145 | pure $ Val n t v 146 | fun = do 147 | reserved "fun" 148 | n <- ident 149 | args <- some param 150 | retTy <- option THole (reservedOp ":" *> typ) 151 | reservedOp "=" 152 | body <- expr 153 | let ty = foldr (\(_, argTy) -> TFun (fromMaybe THole argTy)) retTy args 154 | pure $ Fun n ty (foldr (uncurry ELam) body args) 155 | 156 | 157 | -------------------------------------------------------------------------------- /src/Utils.hs: -------------------------------------------------------------------------------- 1 | -- | Some general helper functions 2 | module Utils where 3 | 4 | import Data.List 5 | import Data.Char 6 | import Data.Text (Text) 7 | 8 | type Name = Text 9 | 10 | -- TODO: iter does not fuse, due to cross-module inlining issues. 11 | -- Probably worth filing a GHC bug report. 12 | iter :: Int -> (a -> a) -> a -> a 13 | iter count f x = iterate' f x !! count 14 | 15 | -- | Show an integer as a subscript: 'showSubscript 123 == "₁₂₃"' 16 | showSubscript :: Int -> String 17 | showSubscript = map subscript . show 18 | where subscript c = "₀₁₂₃₄₅₆₇₈₉" !! (ord c - ord '0') 19 | -------------------------------------------------------------------------------- /test.txt: -------------------------------------------------------------------------------- 1 | let 2 | fun id x = let val y = x in y end 3 | fun f (g : forall a. a -> a) = 3 + g 4 4 | in f id end 5 | --------------------------------------------------------------------------------