├── .gitignore ├── LICENSE ├── README.md ├── Setup.hs ├── explanation.md ├── higher-order-unification.cabal ├── src ├── Client.hs └── Unification.hs └── stack.yaml /.gitignore: -------------------------------------------------------------------------------- 1 | .stack-work 2 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2017 Danny Gratzer 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining 4 | a copy of this software and associated documentation files (the 5 | "Software"), to deal in the Software without restriction, including 6 | without limitation the rights to use, copy, modify, merge, publish, 7 | distribute, sublicense, and/or sell copies of the Software, and to 8 | permit persons to whom the Software is furnished to do so, subject to 9 | the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be included 12 | in all copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 15 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 16 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 17 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 18 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 19 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 20 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## higher-order-unification 2 | 3 | A simple, concise implementation of Huet's algorithm. Written because 4 | it's difficult to translate the simple prose explanations of 5 | algorithms often adopted by the unification community to a working 6 | piece of code. The code is documented fully in `explanation.md`. 7 | 8 | An example of how higher-order unification might be used may be found 9 | in `src/Client.hs` which provides a simple type-inference/checking 10 | algorithm for a dependently typed language with `Type : Type`. 11 | -------------------------------------------------------------------------------- /Setup.hs: -------------------------------------------------------------------------------- 1 | import Distribution.Simple 2 | main = defaultMain 3 | -------------------------------------------------------------------------------- /explanation.md: -------------------------------------------------------------------------------- 1 | ## An Explanation of Unification.hs 2 | 3 | In order to make this code useful to others, I would like to take the 4 | time to explain exactly how it works. In this file, we will go through 5 | the unification algorithm being used and how it is implemented in the 6 | code. 7 | 8 | ### The Problem 9 | 10 | Before beginning, it's worth clarifying the problem that we're 11 | attempting to solve with this code, namely, what is higher order 12 | unification? The simple answer is that we want to take two terms with 13 | "holes" in them, called metavariables. We then want to figure out how 14 | to replace those metavariables with programs so that the two terms, 15 | once fully filled in, evaluate to the same term. Our language will 16 | contain the following constructs, 17 | 18 | 1. Variables 19 | 2. Functions (lambdas) and correspondingly application 20 | 3. Metavariables 21 | 4. Function types, in the typical style for a dependently typed 22 | language: pi types 23 | 5. Universe types 24 | 25 | This was originally designed to be part of a typechecker for a 26 | particular dependently typed language, hence the pi types and 27 | universes, but they can safely be ignored and treated as particular 28 | constants. 29 | 30 | The main issue is, that it's actually undecidable to do this in the 31 | general case. It's therefore only possible to implement a semidecision 32 | procedure that performs relatively well in practice. By a semidecision 33 | procedure, I mean an algorithm that will terminate with a solution 34 | when possible and reject only some of the time. This procedure is 35 | called Huet's algorithm and it's essentially a refinement of the 36 | following algorithm 37 | 38 | 1. Generate a solution 39 | 2. Test it 40 | 3. If the solution was correct, stop 41 | 4. Else, go to 1 42 | 43 | This is not exactly the most sophisticated algorithm but it does have 44 | the benefit of being obviously a correct semidecision procedure for 45 | our problem. The idea with Huet's algorithm is to gradually produce a 46 | solution and to only produce solutions that are at least not 47 | obviously wrong. By doing this, we drastically cut down the search 48 | space and produce answers reasonably quickly. 49 | 50 | ### The Set Up 51 | 52 | To begin with, we introduce the tools that we will need to even code 53 | up the unification algorithm. The first critical point is how we 54 | define the language we're unifying in the first place. I will 55 | represent terms using the so-called "locally nameless" approach. This 56 | means that we use 57 | [de Bruijn](https://en.wikipedia.org/wiki/De_Bruijn_index) to 58 | represent bound variables. However, for free variables we will 59 | generate globally unique identifiers to simplify the process of 60 | carrying around contexts or the like. This does mean that our AST has 61 | two different constructors for variables, free and bound. 62 | 63 | ``` haskell 64 | type Id = Int 65 | type Index = Int 66 | data Term = FreeVar Id 67 | | LocalVar Index 68 | | MetaVar Id 69 | | Uni 70 | | Ap Term Term 71 | | Lam Term 72 | | Pi Term Term 73 | deriving (Eq, Show, Ord) 74 | ``` 75 | 76 | Since we're using de Bruijn indices, we also need to define a crucial 77 | helper function called `raise :: Int -> Term -> Term`. This raises all 78 | the variables wrapped in a `LocalVar` constructor up by `i`. This is 79 | done by recursing over the inputted term. 80 | 81 | ``` haskell 82 | raise :: Int -> Term -> Term 83 | raise = go 0 84 | where go lower i t = case t of 85 | FreeVar i -> FreeVar i 86 | LocalVar j -> if i > lower then LocalVar (i + j) else LocalVar j 87 | MetaVar i -> MetaVar i 88 | Uni -> Uni 89 | Ap l r -> go lower i l `Ap` go lower i r 90 | Lam body -> Lam (go (lower + 1) i body) 91 | Pi tp body -> Pi (go lower i tp) (go (lower + 1) i body) 92 | ``` 93 | 94 | Using this, we can define substitution on terms. This will be useful 95 | later on directly. For this, we first define the notion of replacing a 96 | de Bruijn variable with a term. 97 | 98 | ``` haskell 99 | subst :: Term -> Int -> Term -> Term 100 | subst new i t = case t of 101 | FreeVar i -> FreeVar i 102 | LocalVar j -> case compare j i of 103 | LT -> LocalVar j 104 | EQ -> new 105 | GT -> LocalVar (j - 1) 106 | MetaVar i -> MetaVar i 107 | Uni -> Uni 108 | Ap l r -> subst new i l `Ap` subst new i r 109 | Lam body -> Lam (subst (raise 1 new) (i + 1) body) 110 | Pi tp body -> Pi (subst new i tp) (subst (raise 1 new) (i + 1) body) 111 | ``` 112 | 113 | Notice that we have used `raise` to escape `new` as we go under 114 | binders to avoid capturing variables. Similarly, since we're removing 115 | a binding level, if we have any de Bruijn variables that refer to a 116 | binding site outside of the one we're working with we have to lower it 117 | to compensate. That is the reason for the line `GT -> LocalVar (j - 1)`. 118 | Apart from these two complications, substitution is just hunting for 119 | all occurrences of `LocalVar i` and replacing it with `new`. However, 120 | we also have this metavariables so it makes sense that we have a 121 | notion of substitution for these as well. It's simpler than the above 122 | substitution function because we don't have to worry about lowering 123 | variables that might be affected by deleting a metavariable binding 124 | since we're using globally unique identifiers for them. 125 | 126 | ``` haskell 127 | substMV :: Term -> Id -> Term -> Term 128 | substMV new i t = case t of 129 | FreeVar i -> FreeVar i 130 | LocalVar i -> LocalVar i 131 | MetaVar j -> if i == j then new else MetaVar j 132 | Uni -> Uni 133 | Ap l r -> substMV new i l `Ap` substMV new i r 134 | Lam body -> Lam (substMV (raise 1 new) i body) 135 | Pi tp body -> Pi (substMV new i tp) (substMV (raise 1 new) i body) 136 | ``` 137 | 138 | Now there are only a few more utility functions left before we can get 139 | to the actual unification. We need a function to gather all of the 140 | metavariables in a term. For this we use a `Set` from `containers` and 141 | just fold over the structure of the term. 142 | 143 | ``` haskell 144 | metavars :: Term -> S.Set Id 145 | metavars t = case t of 146 | FreeVar i -> S.empty 147 | LocalVar i -> S.empty 148 | MetaVar j -> S.singleton j 149 | Uni -> S.empty 150 | Ap l r -> metavars l <> metavars r 151 | Lam body -> metavars body 152 | Pi tp body -> metavars tp <> metavars body 153 | ``` 154 | 155 | Another useful function will be necessary for enforcing the condition 156 | that we only unify metavariables with closed terms (no capturing). In 157 | order to handle this, we will need to check that a given term is 158 | closed. This is as simple as looking to see if it mentions the 159 | `FreeVar` constructor since `LocalVar` is used for only bound 160 | variables by invariant. 161 | 162 | ``` haskell 163 | isClosed :: Term -> Bool 164 | isClosed t = case t of 165 | FreeVar i -> False 166 | LocalVar i -> True 167 | MetaVar j -> True 168 | Uni -> True 169 | Ap l r -> isClosed l && isClosed r 170 | Lam body -> isClosed body 171 | Pi tp body -> isClosed tp && isClosed body 172 | ``` 173 | 174 | The last complicated utility function is `reduce`. This is actually 175 | just a simple interpreter for the language we defined earlier. It 176 | essentially repeatedly searches for `Ap (Lam ...) ...` and when it 177 | finds such an occurrence substitutes the argument into the body of the 178 | function as one might expect. I have made this function reduce 179 | everywhere because it seems to provide a significant performance 180 | improvement in many cases. 181 | 182 | ``` haskell 183 | reduce :: Term -> Term 184 | reduce t = case t of 185 | FreeVar i -> FreeVar i 186 | LocalVar j -> LocalVar j 187 | MetaVar i -> MetaVar i 188 | Uni -> Uni 189 | Ap l r -> case reduce l of 190 | Lam body -> reduce (subst r 0 body) 191 | l' -> Ap l' (reduce r) 192 | Lam body -> Lam (reduce body) 193 | Pi tp body -> Pi (reduce tp) (reduce body) 194 | ``` 195 | 196 | The remaining utility funcitons are simply checks and manipulations 197 | that we will frequently need on terms. We have a function which checks 198 | whether a term is of the form `M e1 e2 e3 ...` for some metavariable 199 | `M`; such terms are said to be stuck. 200 | 201 | ``` haskell 202 | isStuck :: Term -> Bool 203 | isStuck MetaVar {} = True 204 | isStuck (Ap f _) = isStuck f 205 | isStuck _ = False 206 | ``` 207 | 208 | The remaining utility functions simply convert telescopes of 209 | applications, `f a1 a2 a3 ...`, into an function and a list of 210 | arguments, `(f, [a1 ... an])` and then we have a function to put 211 | things back again. 212 | 213 | ``` haskell 214 | peelApTelescope :: Term -> (Term, [Term]) 215 | peelApTelescope t = go t [] 216 | where go (Ap f r) rest = go f (r : rest) 217 | go t rest = (t, rest) 218 | 219 | applyApTelescope :: Term -> [Term] -> Term 220 | applyApTelescope = foldl' Ap 221 | ``` 222 | 223 | We are now in a position to turn to implementing the actual 224 | unification algorithm with all of our utilities in hand. 225 | 226 | ### The Unification Algorithm 227 | 228 | There are really only two key functions in implementing the 229 | unification algorithm. We can either take an existing constraint and 230 | simplify it, or take a constraint and produce a list of partial 231 | solutions, at least one of which is correct if the constraint is 232 | solvable. The first function is remarkably similar to the first-order 233 | case of unification, we essentially take a constraint and produce a 234 | set of constraints which are equivalent to the original one. For 235 | instance, if our constraint that we're trying to solve is 236 | 237 | ``` haskell 238 | FreeVar 0 `Ap` E === FreeVar 0 `Ap` E' 239 | ``` 240 | 241 | It's easy to see that we might as well solve constraint `E === E'` 242 | which is strictly simpler. This is what the function `simplify` 243 | does. It has the type 244 | 245 | ``` haskell 246 | simplify :: Constraint -> UnifyM (S.Set Constraint) 247 | ``` 248 | 249 | In order to work with generating fresh metavariables and (later) 250 | backtracking, we use the monad `UnifyM`. This is defined, as is 251 | `Constraint`, as a type synonym 252 | 253 | ``` haskell 254 | type UnifyM = LogicT (Gen Id) 255 | type Constraint = (Term, Term) 256 | ``` 257 | 258 | Here we are using the package 259 | [logict](https://hackage.haskell.org/package/logict) to provide 260 | backtracking. My tutorial of this package can be found 261 | [here](https://jozefg.bitbucket.io/posts/2014-07-10-reading-logict.html). We 262 | are also using a package a threw together a few years ago called 263 | [`monad-gen`](https://hackage.haskell.org/package/monad-gen), it just 264 | provides a simple monad for generating fresh values. The sort of thing 265 | that I always end up needing in compilers. Without further-ado, let's 266 | start going through the cases for `simplify`. Each one of which 267 | corresponds to a simplifying move we are allowed to make on a 268 | constraint, ordered in terms of priority. 269 | 270 | ``` haskell 271 | simplify (t1, t2) 272 | | t1 == t2 = return S.empty 273 | ``` 274 | 275 | We start out with a nice and simple case, if the two terms of the 276 | constraint are literally identical, we have no further goals. Next we 277 | have two cases integrating reduction. If either term is reducible at 278 | all we reduce it and try to simplify the remaining goals. 279 | 280 | ``` haskell 281 | | reduce t1 /= t1 = simplify (reduce t1, t2) 282 | | reduce t2 /= t2 = simplify (t1, reduce t2) 283 | ``` 284 | 285 | This is how we integrate the fact that our unification is modulo 286 | reduction (we allow two terms to unify if they reduce to the same 287 | thing). Next comes the cases that are a little more sophisticated and 288 | correspond more closely to our original motivating example. If our two 289 | terms are a several things applied to free variables, we know the 290 | following 291 | 292 | 1. The free variables have to be the same 293 | 2. All of the arguments must unify 294 | 295 | This is captured by the following branch of simplify. 296 | 297 | ``` haskell 298 | | (FreeVar i, cxt) <- peelApTelescope t1, 299 | (FreeVar j, cxt') <- peelApTelescope t2 = do 300 | guard (i == j && length cxt == length cxt') 301 | fold <$> mapM simplify (zip cxt cxt') 302 | ``` 303 | 304 | This code just codifies the procedure that we have informally sketched 305 | above. If we're trying to unify `A a1 ... an` and `B b1 ... bm` for 306 | two free variables `A` and `B` then we must have `A = B` and `n = m` 307 | since we have to find a solution that works for any `A` and any 308 | `B`. Finally, we then just need to unify `ai` with `bi`. The next two 309 | cases are congruence type rules. We basically just produce new 310 | constraints saying that `Lam e === Lam e'` if and only if `e === 311 | e'`. There is a small amount of bookkeeping done to make sure that 312 | free variables are correctly represented by a globally unique 313 | `FreeVar i`. The same thing is done for `Pi` except, since `Pi`s are 314 | annotated with a type we also add a constraint for these types as well. 315 | 316 | ``` haskell 317 | | Lam body1 <- t1, 318 | Lam body2 <- t2 = do 319 | v <- FreeVar <$> lift gen 320 | return $ S.singleton (subst v 0 body1, subst v 0 body2) 321 | | Pi tp1 body1 <- t1, 322 | Pi tp2 body2 <- t2 = do 323 | v <- FreeVar <$> lift gen 324 | return $ S.fromList 325 | [(subst v 0 body1, subst v 0 body2), 326 | (tp1, tp2)] 327 | ``` 328 | 329 | The final case is to decide whether or not the constraint is "stuck" 330 | on a metavariable, in which case we'll need to guess a solution for a 331 | metavariable or whether the constraint is just impossible. If neither 332 | constraint is stuck, we fail using `mzero` and if we're stuck then we 333 | just return the inputted constraint since we can make it no simpler. 334 | 335 | ``` haskell 336 | | otherwise = 337 | if isStuck t1 || isStuck t2 then return $ S.singleton (t1, t2) else mzero 338 | ``` 339 | 340 | Now we turn to the most complicated part of the algorithm, where we 341 | actual try and produce possible and partial solutions for our 342 | unification constraints. The basic idea is to work with constraints of 343 | the form 344 | 345 | ``` haskell 346 | M a1 a2 ... an = A b1 b2 ... bm 347 | ``` 348 | 349 | where `M` is a metavariable and `A` is a some term, probably a free 350 | variable. These are called flex-rigid equations because one half is 351 | flexible, a metavariable, while one half is rigid. The first part of 352 | this code is to extract the relevant pieces of information from the 353 | constraint. Therefore, the code roughly looks like 354 | 355 | ``` haskell 356 | tryFlexRigid :: Constraint -> [UnifyM [Subst]] 357 | tryFlexRigid (t1, t2) 358 | | (MetaVar i, cxt1) <- peelApTelescope t1, 359 | (stuckTerm, cxt2) <- peelApTelescope t2, 360 | not (i `S.member` metavars t2) = error "TODO" 361 | | (MetaVar i, cxt1) <- peelApTelescope t2, 362 | (stuckTerm, cxt2) <- peelApTelescope t1, 363 | not (i `S.member` metavars t1) = error "TODO" 364 | | otherwise = [] 365 | ``` 366 | 367 | This simply uses `peelApTelescope` to extract the 4 components `M`, 368 | `(a1 ... an)`, `A` and `(b1 ... bm)`. The resulting type is "morally" 369 | supposed to be `[Subst]` but for technical reasons we need to 370 | `[UnifyM [Subst]]` because we need to generate metavariables for the 371 | substitutions. There are exactly 2 forms that `M` may take 372 | 373 | - `M = λ x1. ... λ xn. xi (M1 x1 ... xn) ... (Mr x1 ... xn)` 374 | - `M = λ x1. ... λ xn. A (M1 x1 ... xn) ... (Mr x1 ... xn)` 375 | (if `A` is closed) 376 | 377 | These are the only two forms that `M` can take because if `M` is any 378 | other constant or free variable than it would immediately 379 | contradictory, `M` couldn't possibly unify with `A b1 ... bm` as we 380 | need it to. Therefore, `tryFlexRigid` will produce a list of such 381 | substitutions (mod effects) replacing `M` with both of these. Since we 382 | don't know how many subterms we must apply to `xi` or `A` this will be 383 | an infinitely long list. More on this complication will 384 | follow. Therefore, we can replace `error "TODO"` with 385 | 386 | ``` haskell 387 | type Subst = M.Map Id Term 388 | 389 | tryFlexRigid :: Constraint -> [UnifyM [Subst]] 390 | tryFlexRigid (t1, t2) 391 | | (MetaVar i, cxt1) <- peelApTelescope t1, 392 | (stuckTerm, cxt2) <- peelApTelescope t2, 393 | not (i `S.member` metavars t2) = proj (length cxt1) i stuckTerm 0 394 | | (MetaVar i, cxt1) <- peelApTelescope t2, 395 | (stuckTerm, cxt2) <- peelApTelescope t1, 396 | not (i `S.member` metavars t1) = proj (length cxt1) i stuckTerm 0 397 | | otherwise = [] 398 | ``` 399 | 400 | Here `proj` generates the list of substitutions. It's arguments are 401 | 402 | 1. The number of bound variables 403 | 2. The metavariable we're trying to find substitutions for 404 | 3. The term `A` that we may use to construct a substitution for `M` 405 | 4. The number of subterms to generate (this will be incremented in 406 | the recursive call) 407 | 408 | It's defined just as 409 | 410 | ``` haskell 411 | proj bvars mv f nargs = 412 | generateSubst bvars mv f nargs : proj bvars mv f (nargs + 1) 413 | ``` 414 | 415 | Now the work is done in the actual function 416 | `generateSubst :: Int -> Id -> Term -> Int -> UnifyM [Subst]`. We have 417 | already explained the behavior of `generateSubst`, it's just going to 418 | create all possible substitutions of the form described above. There 419 | is little more to say than to just show the code. 420 | 421 | ``` haskell 422 | generateSubst bvars mv f nargs = do 423 | let mkLam tm = foldr ($) tm (replicate bvars Lam) 424 | let saturateMV tm = foldl' Ap tm (map LocalVar [0..bvars - 1]) 425 | let mkSubst = M.singleton mv 426 | args <- map saturateMV . map MetaVar <$> replicateM nargs (lift gen) 427 | return [mkSubst . mkLam $ applyApTelescope t args 428 | | t <- map LocalVar [0..bvars - 1] ++ 429 | if isClosed f then [f] else []] 430 | ``` 431 | 432 | All that is left to do is to tie these two functions together in to 433 | try and produce a solution in general. One small caveat is that we 434 | need a few simple functions for working with substitutions. One to 435 | take a `Subst` and perform all the indicated replacements on a term 436 | and one to take two substitutions and perform a disjoint merge on 437 | them. 438 | 439 | ``` haskell 440 | manySubst :: Subst -> Term -> Term 441 | manySubst s t = M.foldrWithKey (\mv sol t -> substMV sol mv t) t s 442 | 443 | (<+>) :: Subst -> Subst -> Subst 444 | s1 <+> s2 | not (M.null (M.intersection s1 s2)) = error "Impossible" 445 | s1 <+> s2 = M.union (manySubst s1 <$> s2) s1 446 | ``` 447 | 448 | Now our main function, `unify` will take the current substitution and 449 | a set of constraints and produce a solution substitution and a set of 450 | flex-flex equations. These are equations of the form 451 | `M a1 ... an = M' b1 ... bn`. It is provable that so called flex-flex 452 | equations are always solvable (cf Huet's lemma) but solving them in a 453 | canonical way is impossible so we instead produce the solution "up to" 454 | flex-flex equations and let the user deal with the ambiguity however 455 | they choose. For example, such an equation in resulting from Agda's 456 | unification algorithm will produce the error "unresolved 457 | metavariables" because the metavariable is not canonically 458 | determined. Therefore, our main algorithm proceeds in the following 459 | steps 460 | 461 | 1. Apply the given substitution to all our constraints. 462 | 2. Simplify the set of constraints to remove any obvious ones. 463 | 3. Separate flex-flex equations from flex-rigid ones. 464 | 4. Pick a flex-rigid equation at random, if there are none, we're 465 | done. 466 | 5. Use `tryFlexRigid` to get a list of possible solutions 467 | 6. Try each solution and attempt to unify the remaining constraints, 468 | backtracking if we get stuck 469 | 470 | In order to implement 2, we define a function which is simply the 471 | "closure" of `simplify` and applies it until there is no more 472 | simplification to be done. 473 | 474 | ``` haskell 475 | repeatedlySimplify :: S.Set Constraint -> UnifyM (S.Set Constraint) 476 | repeatedlySimplify cs = do 477 | cs' <- fold <$> traverse simplify (S.toList cs) 478 | if cs' == cs then return cs else repeatedlySimplify cs' 479 | ``` 480 | 481 | Apart from this, the main routine four unification is quite 482 | declarative 483 | 484 | ``` haskell 485 | unify :: Subst -> S.Set Constraint -> UnifyM (Subst, S.Set Constraint) 486 | unify s cs = do 487 | let cs' = applySubst s cs 488 | cs'' <- repeatedlySimplify cs' 489 | let (flexflexes, flexrigids) = S.partition flexflex cs'' 490 | if S.null flexrigids 491 | then return (s, flexflexes) 492 | else do 493 | let psubsts = tryFlexRigid (S.findMax flexrigids) 494 | trySubsts psubsts (flexrigids <> flexflexes) 495 | ``` 496 | 497 | The first line implements step 1, using 498 | `applySubst :: Subst -> S.Set Constraint -> S.Set Constraint` to apply 499 | our substitution. The next line simplifies the constraints so we're 500 | left with flex-flex or flex-rigid constraints. After this, we can 501 | partition the constraints into these two classes. From here, we simply 502 | implement steps 4-6 making use of the helper function `trySubst` 503 | 504 | ``` haskell 505 | trySubsts :: [UnifyM [Subst]] -> S.Set Constraint -> UnifyM (Subst,S.Set Constraint) 506 | ``` 507 | 508 | This function takes care of peeling out each substitution and applying 509 | it to the constraints we have lying around. In order to cope with the 510 | fact that all of these are potentially infinite and we need to fairly 511 | search the resulting space, we make use of 512 | `interleave :: m a -> m a -> m a` from `logict`. It's essentially 513 | equivalent to `mplus` from the list monad but search fairly in the 514 | case of infinite lists. This takes care of handling backtracking in a 515 | seamless and mostly invisible way, Haskell is fun sometimes! The code 516 | for implementing this is essentially just `interleave`-ing all the 517 | recursive calls to `unify` that we need to make using `mzero`, 518 | failure, for when we've run out of substitutions to try. 519 | 520 | ``` haskell 521 | trySubsts [] cs = mzero 522 | trySubsts (mss : psubsts) cs = do 523 | ss <- mss 524 | let these = foldr interleave mzero [unify (newS <+> s) cs | newS <- ss] 525 | let those = trySubsts psubsts cs 526 | these `interleave` those 527 | ``` 528 | 529 | Putting all of this code together, we have completed a higher-order 530 | unificaiton algorithm! To make a top-level function to play with, we 531 | add a driver function which runs `unify` and strips out all of the 532 | monads of `UnifyM` 533 | 534 | ``` haskell 535 | driver :: Constraint -> Maybe (Subst, S.Set Constraint) 536 | driver = listToMaybe . runGenFrom 100 . observeAllT . unify M.empty . S.singleton 537 | ``` 538 | -------------------------------------------------------------------------------- /higher-order-unification.cabal: -------------------------------------------------------------------------------- 1 | name: higher-order-unification 2 | version: 0.1.0.0 3 | synopsis: A simple implementation of higher order unification 4 | license: MIT 5 | license-file: LICENSE 6 | author: Danny Gratzer 7 | maintainer: jozefg@cmu.edu 8 | category: Language 9 | build-type: Simple 10 | extra-source-files: README.md 11 | cabal-version: >=1.10 12 | 13 | 14 | library 15 | exposed-modules: Unification 16 | , Client 17 | build-depends: base >=4 && <5 18 | , containers 19 | , logict 20 | , monad-gen 21 | , mtl 22 | hs-source-dirs: src 23 | default-language: Haskell2010 24 | -------------------------------------------------------------------------------- /src/Client.hs: -------------------------------------------------------------------------------- 1 | module Client (infer) where 2 | import Control.Monad 3 | import Control.Monad.Gen 4 | import Control.Monad.Trans 5 | import qualified Data.Map as M 6 | import Data.Maybe 7 | import Data.Monoid 8 | import qualified Data.Set as S 9 | import Unification 10 | 11 | typeOf :: M.Map Id Term 12 | -> M.Map Id Term 13 | -> Term 14 | -> UnifyM (Term, S.Set Constraint) 15 | typeOf mcxt cxt t = case t of 16 | LocalVar i -> mzero 17 | FreeVar i -> maybe mzero (\x -> return (x, S.empty)) $ M.lookup i cxt 18 | MetaVar i -> maybe mzero (\x -> return (x, S.empty)) $ M.lookup i mcxt 19 | Uni -> return (Uni, S.empty) 20 | Ap l r -> do 21 | (tpl, cl) <- typeOf mcxt cxt l 22 | (tpr, cr) <- typeOf mcxt cxt r 23 | case tpl of 24 | Pi from to -> return (subst r 0 to, cl <> cr <> S.singleton (from, tpr)) 25 | _ -> do 26 | (m1, m2) <- (,) <$> lift gen <*> lift gen 27 | return ( MetaVar m2 `Ap` r 28 | , cl <> cr <> 29 | S.fromList [ (tpl, Pi (MetaVar m1) (MetaVar m2 `Ap` LocalVar 0)) 30 | , (tpr, MetaVar m1) ]) 31 | Lam b -> do 32 | (v, m) <- (,) <$> lift gen <*> lift gen 33 | (to, cs) <- 34 | typeOf (M.insert m Uni mcxt) (M.insert v (MetaVar m) cxt) 35 | (subst (FreeVar v) 0 b) 36 | return ( Pi (MetaVar m) (substFV (LocalVar 0) v (raise 1 to)) 37 | , cs <> S.singleton (MetaVar m, MetaVar m)) 38 | Pi from to -> do 39 | v <- lift gen 40 | (fromTp, fromCs) <- typeOf mcxt cxt from 41 | (toTp, toCs) <- typeOf mcxt (M.insert v from cxt) (subst (FreeVar v) 0 to) 42 | return (Uni, fromCs <> toCs <> S.fromList [(Uni, fromTp), (Uni, toTp)]) 43 | 44 | infer :: Term -> Maybe (Term, S.Set Constraint) 45 | infer t = listToMaybe . runUnifyM $ go 46 | where go = do 47 | (tp, cs) <- typeOf M.empty M.empty t 48 | (subst, flexflex) <- unify M.empty cs 49 | return (manySubst subst tp, flexflex) 50 | -------------------------------------------------------------------------------- /src/Unification.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE FlexibleContexts #-} 2 | module Unification 3 | ( Id(..) 4 | , Index(..) 5 | , Term(..) 6 | , raise 7 | , subst 8 | , substMV 9 | , manySubst 10 | , substFV 11 | , UnifyM (..) 12 | , Constraint(..) 13 | , Subst (..) 14 | , unify 15 | , runUnifyM 16 | , driver) where 17 | import Control.Monad 18 | import Control.Monad.Gen 19 | import Control.Monad.Logic 20 | import Control.Monad.Trans 21 | import qualified Data.Map.Strict as M 22 | import Data.Maybe 23 | import Data.Foldable 24 | import Data.Monoid 25 | import qualified Data.Set as S 26 | 27 | -------------------------------------------------- 28 | ------------------ the language ------------------ 29 | -------------------------------------------------- 30 | 31 | type Id = Int 32 | type Index = Int 33 | data Term = FreeVar Id 34 | | LocalVar Index 35 | | MetaVar Id 36 | | Uni 37 | | Ap Term Term 38 | | Lam Term 39 | | Pi Term Term 40 | deriving (Eq, Show, Ord) 41 | 42 | -- | Raise @LocalVar@s without a binding by @i@ amount. Used to avoid 43 | -- capture in terms with free de Bruijn variables. 44 | raise :: Int -> Term -> Term 45 | raise = go 0 46 | where go lower i t = case t of 47 | FreeVar i -> FreeVar i 48 | LocalVar j -> if i > lower then LocalVar (i + j) else LocalVar j 49 | MetaVar i -> MetaVar i 50 | Uni -> Uni 51 | Ap l r -> go lower i l `Ap` go lower i r 52 | Lam body -> Lam (go (lower + 1) i body) 53 | Pi tp body -> Pi (go lower i tp) (go (lower + 1) i body) 54 | 55 | -- | Substitute a term for the de Bruijn variable @i@. 56 | subst :: Term -> Int -> Term -> Term 57 | subst new i t = case t of 58 | FreeVar i -> FreeVar i 59 | LocalVar j -> case compare j i of 60 | LT -> LocalVar j 61 | EQ -> new 62 | GT -> LocalVar (j - 1) 63 | MetaVar i -> MetaVar i 64 | Uni -> Uni 65 | Ap l r -> subst new i l `Ap` subst new i r 66 | Lam body -> Lam (subst (raise 1 new) (i + 1) body) 67 | Pi tp body -> Pi (subst new i tp) (subst (raise 1 new) (i + 1) body) 68 | 69 | -- | Substitute a term for all metavariables with a given identifier. 70 | substMV :: Term -> Id -> Term -> Term 71 | substMV new i t = case t of 72 | FreeVar i -> FreeVar i 73 | LocalVar i -> LocalVar i 74 | MetaVar j -> if i == j then new else MetaVar j 75 | Uni -> Uni 76 | Ap l r -> substMV new i l `Ap` substMV new i r 77 | Lam body -> Lam (substMV (raise 1 new) i body) 78 | Pi tp body -> Pi (substMV new i tp) (substMV (raise 1 new) i body) 79 | 80 | -- | Substitute a term for all free variable with a given identifier. 81 | substFV :: Term -> Id -> Term -> Term 82 | substFV new i t = case t of 83 | FreeVar j -> if i == j then new else FreeVar j 84 | MetaVar i -> MetaVar i 85 | LocalVar i -> LocalVar i 86 | Uni -> Uni 87 | Ap l r -> substFV new i l `Ap` substFV new i r 88 | Lam body -> Lam (substFV (raise 1 new) i body) 89 | Pi tp body -> Pi (substFV new i tp) (substFV (raise 1 new) i body) 90 | 91 | -- | Gather all the metavariables in a term into a set. 92 | metavars :: Term -> S.Set Id 93 | metavars t = case t of 94 | FreeVar i -> S.empty 95 | LocalVar i -> S.empty 96 | MetaVar j -> S.singleton j 97 | Uni -> S.empty 98 | Ap l r -> metavars l <> metavars r 99 | Lam body -> metavars body 100 | Pi tp body -> metavars tp <> metavars body 101 | 102 | -- | Returns @True@ if a term has no free variables and is therefore a 103 | -- valid candidate for a solution to a metavariable. 104 | isClosed :: Term -> Bool 105 | isClosed t = case t of 106 | FreeVar i -> False 107 | LocalVar i -> True 108 | MetaVar j -> True 109 | Uni -> True 110 | Ap l r -> isClosed l && isClosed r 111 | Lam body -> isClosed body 112 | Pi tp body -> isClosed tp && isClosed body 113 | 114 | -- | Implement reduction for the language. Given a term, normalize it. 115 | -- This boils down mainly to applying lambdas to their arguments and all 116 | -- the appropriate congruence rules. 117 | reduce :: Term -> Term 118 | reduce t = case t of 119 | FreeVar i -> FreeVar i 120 | LocalVar j -> LocalVar j 121 | MetaVar i -> MetaVar i 122 | Uni -> Uni 123 | Ap l r -> case reduce l of 124 | Lam body -> reduce (subst r 0 body) 125 | l' -> Ap l' (reduce r) 126 | Lam body -> Lam (reduce body) 127 | Pi tp body -> Pi (reduce tp) (reduce body) 128 | 129 | -- | Check to see if a term is blocked on applying a metavariable. 130 | isStuck :: Term -> Bool 131 | isStuck MetaVar {} = True 132 | isStuck (Ap f _) = isStuck f 133 | isStuck _ = False 134 | 135 | -- | Turn @f a1 a2 a3 a4 ... an@ to @(f, [a1...an])@. 136 | peelApTelescope :: Term -> (Term, [Term]) 137 | peelApTelescope t = go t [] 138 | where go (Ap f r) rest = go f (r : rest) 139 | go t rest = (t, rest) 140 | 141 | -- | Turn @(f, [a1...an])@ into @f a1 a2 a3 a4 ... an@. 142 | applyApTelescope :: Term -> [Term] -> Term 143 | applyApTelescope = foldl' Ap 144 | 145 | ----------------------------------------------------------------- 146 | -------------- the actual unification code ---------------------- 147 | ----------------------------------------------------------------- 148 | 149 | type UnifyM = LogicT (Gen Id) 150 | type Constraint = (Term, Term) 151 | 152 | -- | Given a constraint, produce a collection of equivalent but 153 | -- simpler constraints. Any solution for the returned set of 154 | -- constraints should be a solution for the original constraint. 155 | simplify :: Constraint -> UnifyM (S.Set Constraint) 156 | simplify (t1, t2) 157 | | t1 == t2 && S.null (metavars t1) = return S.empty 158 | | reduce t1 /= t1 = simplify (reduce t1, t2) 159 | | reduce t2 /= t2 = simplify (t1, reduce t2) 160 | | (FreeVar i, cxt) <- peelApTelescope t1, 161 | (FreeVar j, cxt') <- peelApTelescope t2 = do 162 | guard (i == j && length cxt == length cxt') 163 | fold <$> mapM simplify (zip cxt cxt') 164 | | Lam body1 <- t1, 165 | Lam body2 <- t2 = do 166 | v <- FreeVar <$> lift gen 167 | return $ S.singleton (subst v 0 body1, subst v 0 body2) 168 | | Pi tp1 body1 <- t1, 169 | Pi tp2 body2 <- t2 = do 170 | v <- FreeVar <$> lift gen 171 | return $ S.fromList 172 | [(subst v 0 body1, subst v 0 body2), 173 | (tp1, tp2)] 174 | | otherwise = 175 | if isStuck t1 || isStuck t2 then return $ S.singleton (t1, t2) else mzero 176 | 177 | type Subst = M.Map Id Term 178 | 179 | -- | Generate all possible solutions to flex-rigid equations as an 180 | -- infinite list of computations producing finite lists. 181 | tryFlexRigid :: Constraint -> [UnifyM [Subst]] 182 | tryFlexRigid (t1, t2) 183 | | (MetaVar i, cxt1) <- peelApTelescope t1, 184 | (stuckTerm, cxt2) <- peelApTelescope t2, 185 | not (i `S.member` metavars t2) = proj (length cxt1) i stuckTerm 0 186 | | (MetaVar i, cxt1) <- peelApTelescope t2, 187 | (stuckTerm, cxt2) <- peelApTelescope t1, 188 | not (i `S.member` metavars t1) = proj (length cxt1) i stuckTerm 0 189 | | otherwise = [] 190 | where proj bvars mv f nargs = 191 | generateSubst bvars mv f nargs : proj bvars mv f (nargs + 1) 192 | generateSubst bvars mv f nargs = do 193 | let mkLam tm = foldr ($) tm (replicate bvars Lam) 194 | let saturateMV tm = foldl' Ap tm (map LocalVar [0..bvars - 1]) 195 | let mkSubst = M.singleton mv 196 | args <- map saturateMV . map MetaVar <$> replicateM nargs (lift gen) 197 | return [mkSubst . mkLam $ applyApTelescope t args 198 | | t <- map LocalVar [0..bvars - 1] ++ 199 | if isClosed f then [f] else []] 200 | 201 | -- | The reflexive transitive closure of @simplify@ 202 | repeatedlySimplify :: S.Set Constraint -> UnifyM (S.Set Constraint) 203 | repeatedlySimplify cs = do 204 | cs' <- fold <$> traverse simplify (S.toList cs) 205 | if cs' == cs then return cs else repeatedlySimplify cs' 206 | 207 | manySubst :: Subst -> Term -> Term 208 | manySubst s t = M.foldrWithKey (\mv sol t -> substMV sol mv t) t s 209 | 210 | (<+>) :: Subst -> Subst -> Subst 211 | s1 <+> s2 | not (M.null (M.intersection s1 s2)) = error "Impossible" 212 | s1 <+> s2 = M.union (manySubst s1 <$> s2) s1 213 | 214 | -- | The top level function, given a substitution and a set of 215 | -- constraints, produce a solution substution and the resulting set of 216 | -- flex-flex equations. 217 | unify :: Subst -> S.Set Constraint -> UnifyM (Subst, S.Set Constraint) 218 | unify s cs = do 219 | let cs' = applySubst s cs 220 | cs'' <- repeatedlySimplify cs' 221 | let (flexflexes, flexrigids) = S.partition flexflex cs'' 222 | if S.null flexrigids 223 | then return (s, flexflexes) 224 | else do 225 | let psubsts = tryFlexRigid (S.findMax flexrigids) 226 | trySubsts psubsts (flexrigids <> flexflexes) 227 | where applySubst s = S.map (\(t1, t2) -> (manySubst s t1, manySubst s t2)) 228 | flexflex (t1, t2) = isStuck t1 && isStuck t2 229 | trySubsts [] cs = mzero 230 | trySubsts (mss : psubsts) cs = do 231 | ss <- mss 232 | let these = foldr interleave mzero [unify (newS <+> s) cs | newS <- ss] 233 | let those = trySubsts psubsts cs 234 | these `interleave` those 235 | 236 | runUnifyM :: UnifyM a -> [a] 237 | runUnifyM = runGenFrom 100 . observeAllT 238 | 239 | -- | Solve a constraint and return the remaining flex-flex constraints 240 | -- and the substitution for it. 241 | driver :: Constraint -> Maybe (Subst, S.Set Constraint) 242 | driver = listToMaybe . runUnifyM . unify M.empty . S.singleton 243 | -------------------------------------------------------------------------------- /stack.yaml: -------------------------------------------------------------------------------- 1 | flags: {} 2 | packages: 3 | - '.' 4 | extra-deps: [monad-gen-0.3.0.1] 5 | resolver: lts-9.0 6 | --------------------------------------------------------------------------------