└── README.md

/README.md:
--------------------------------------------------------------------------------
# plt-errata
Collection of errata for the book _[Implementing Programming Languages](http://www1.digitalgrammars.com/ipl-book)_ by Aarne Ranta.

To add a new erratum, create an issue or pull request.
Please use GitHub Markdown syntax and adhere to the style of this page.
I will then add the erratum below.

Errata reported by:
- Andreas Abel
- Shahnur Isgandarli
- Andreas Lööw
- Nachiappan Valliappan
- WASDi
- Alexander Kurz [#11]
- csoroz [#12] [#13]
- Tobias Hägglund

## Known errata

This includes the errata listed on the book website.

### Chapter 1, Compilation Phases

p. 10 (and also later): it is stated that Python is an untyped language. By this we mean that Python has no compile-time type checking. But it does have a run-time notion of types, known as dynamic typing.

### Chapter 2, Grammars

p. 17: bnfc backend `-java1.4` no longer exists.

p. 18f: `-java` backend: From bnfc 2.8.1, the CUP parser file is named
`_cup.cup`. From 2.8.4, the generated directory is lowercase: `calc`
instead of `Calc`.

p. 25: last line: `show (interpret e)` should be `show (eval e)`.

p. 27: too many classes in the Java example have the name `EAdd`. They should be `EAdd`, `ESub`, `EMul`, `EDiv`.

p. 34: The two LBNF rules
```
SDecl.  Stm ::= Type Id ";" ;
SDecls. Stm ::= Type Id "," [Id] ";" ;
```
can and should be simplified to
```
SDecls. Stm ::= Type [Id] ";" ;
```
where
```
separator nonempty Id "," ;
```

p. 36: The rule for assignment expressions
```
EAss. Exp2 ::= Exp3 "=" Exp2;
```
does not match the form `v=e` given in the table on p. 35.
It should be simplified to:

```
EAss. Exp2 ::= Id "=" Exp2;
```


### Chapter 3, Lexing and Parsing

p. 41: The Empty construction could be simplified by making the
initial state final and saving the ε-transition.

p. 41: The Sequence construction can be simplified by using the
initial state of the first automaton as initial state of the sequence
and the final state of the second automaton as the final state of the
sequence. This saves 2 ε-transitions (and would correspond to the
example on p. 43).

p. 43: This NFA is not generated by the algorithm on p. 41. It misses
ε-transitions. On both paths, there should be 5 ε-transitions, as 2
are generated by the Union and 3 by the Sequence construction.

p. 43: The result of the subset construction should have `0,1,5` as
initial state instead of just `0`. Again on p. 44.

p. 46: The figure should say `m b's` and `n b's` on the arcs going to
the final states (instead of `m a's` and `n a's`).

p. 53: The parse table is missing the goto actions (essential!).

p. 53: The line with `%start_pExp` should be deleted from the table (or explained).
It is more confusing than helpful.

### Chapter 4, Type Checking

p. 64 Exercise 4-0: "`1 + 2 + "hello" + 1 + 2` ... which type of `+`
applies to each of the four additions. Recall that `+` is
left associative!"

A disclaimer should be added here that a programming language would be
ill-designed if it allowed the value of an expression to depend on an
arbitrary choice of association order of a usually associative
operator like `+`. (In this case the value is either `"3hello12"` or
`"12hello3"` depending on the association order.) A sane language
would guarantee the usual law _(x + y) + z = x + (y + z)_ to the
extent possible. (Note that for floating points it holds only
approximately!)
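
The two results mentioned for Exercise 4-0 can be reproduced in Java, where `+` is overloaded for numbers and strings and parsed as left associative. The following is a minimal sketch, not code from the book; the class and method names are hypothetical and chosen only for illustration.

```java
// Hypothetical demo class; not taken from the book.
class PlusAssociativity {
    // Java parses + as left associative: ((((1 + 2) + "hello") + 1) + 2).
    static String leftAssociated() {
        return 1 + 2 + "hello" + 1 + 2;
    }

    // The same operands, explicitly grouped to the right.
    static String rightAssociated() {
        return 1 + (2 + ("hello" + (1 + 2)));
    }

    // Floating-point addition is only approximately associative.
    static boolean doubleSumsAgree() {
        double left = (0.1 + 0.2) + 0.3;
        double right = 0.1 + (0.2 + 0.3);
        return left == right;
    }

    public static void main(String[] args) {
        System.out.println(leftAssociated());   // 3hello12
        System.out.println(rightAssociated());  // 12hello3
        System.out.println(doubleSumsAgree());  // false
    }
}
```

The integer additions to the left of `"hello"` are performed numerically in the left-associated reading, while the right-associated reading turns every enclosing `+` into string concatenation.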

#### 4.7 The validity of statements and function definitions

The judgement for checking statements should be formulated as

```
Γ ⊢ s ⇒ Γ'
```

Declarations such as `int x;` extend the typing context.
This would allow the checking of a sequence of statements to be defined
in the natural way. Actually the Haskell implementation in 4.11
does it exactly like I suggest here.

Section 4.9 would have to be rewritten. What is going on in 4.9 currently
is that the state monad formulation `Γ ⊢ s ⇒ Γ'` is replaced by a
context monad formulation `Γ ⊢ ss valid` at the cost of modularity:
we can only handle statement sequences.

#### 4.8 Declarations and block structures

This section should discuss the scopes for `if` and `while` (see
errata for Section 5.3).

#### 4.9 Implementing a type checker

p. 68: missing closing parenthesis in the code snippet for type checking a single function.

#### 4.10 Annotating type checkers

p. 69: the pseudo-code for `infer(Γ,a+b)` is wrong in that it removes the annotations from the subexpressions of the addition expression. The correct return statement would be

```haskell
return [['a : t] + ['b : t] : t]
```

p. 70: the pseudo-code for `infer(Γ,a+b)` has the same problem as on p. 69.

#### 4.11 Type checker in Haskell

p. 72-73: the implementation of `inferBin` lacks a final `return typ`.

p. 73: in `checkExp` code, `if (typ2 = typ)` should be `if typ2 == typ`.
It could also be written as

```haskell
unless (typ2 == typ) $ fail $
  "type of " ++ ...
```

In `checkStm`, there are several errors.
The correct code is:

```haskell
checkStm :: Env -> Stm -> Err Env
checkStm env s = case s of
  -- Expression statement: the context is unchanged.
  SExp exp -> do
    inferExp env exp
    return env
  -- Declaration: the context is extended with the new variable.
  SDecl typ x ->
    updateVar env x typ
  -- Loop: the body is checked in a new block, whose bindings are dropped.
  SWhile exp stm -> do
    checkExp env Type_bool exp
    checkStm (newBlock env) stm
    return env
```

p. 74, lines 1-3: Use `s` instead of `x` as variable name for statement.

p. 75-76: Use `Void` instead of `Object` as the generic type parameter for `arg`. The change affects several code snippets and the text that describes them.

p. 78: It is bad practice to catch `Throwable`, as it includes fatal JVM errors like `OutOfMemoryError` and `StackOverflowError`. To catch all checked exceptions, `catch (Exception e)` should be used.

### Chapter 5, Interpreters

p. 82: Rule `γ ⊢ x ⇓ v`: The `v` is typeset in the wrong font.

#### 5.3 Statements

It is not clear how the statement interpreter `γ ⊢ s ⇓ γ'` would deal
with `return` statements which are not at the end of the function (or
`break` statements in while loops). I suggest changing it to

```
γ ⊢ s ⇓ ⟨r,γ'⟩
```

where `r ::= continue | return v` is the result of the statement:
usually `continue`, but `return v` for a return statement. Executing
a sequence of statements discards the remaining statements once the
result is `return v`.

p. 84: The specified interpreter gives the wrong result for

```c
int x = 0;
int y = 0;
while (x++ < 1) int y = 1;
return y;
```

It gives `1`, while the correct result is `0`.
The problem is that the body of the `while` will overwrite the value of the shadowed `y`.
To fix this, the body of a `while` has to be treated as if in its own block.
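
The shadowing problem can be reproduced outside the book's interpreter. The following Java sketch is an illustration under assumptions, not the book's code: the class `WhileScopeSketch`, its methods, and the flag `bodyInOwnBlock` are hypothetical names, and the environment is modeled as a stack of maps. It hand-executes the program above, once with the loop body run in the enclosing block and once with the body in a block of its own.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical environment model: a stack of blocks, innermost first.
class WhileScopeSketch {
    private final Deque<Map<String, Integer>> blocks = new ArrayDeque<>();

    WhileScopeSketch() { blocks.push(new HashMap<>()); }

    void enterBlock() { blocks.push(new HashMap<>()); }
    void exitBlock()  { blocks.pop(); }

    // A declaration always binds in the innermost block.
    void declare(String x, int v) { blocks.peek().put(x, v); }

    // Lookup and assignment use the innermost block that binds x.
    int lookup(String x) {
        for (Map<String, Integer> b : blocks)
            if (b.containsKey(x)) return b.get(x);
        throw new IllegalStateException("unbound variable " + x);
    }

    void assign(String x, int v) {
        for (Map<String, Integer> b : blocks)
            if (b.containsKey(x)) { b.put(x, v); return; }
        throw new IllegalStateException("unbound variable " + x);
    }

    // Hand-executes: int x = 0; int y = 0; while (x++ < 1) int y = 1; return y;
    static int run(boolean bodyInOwnBlock) {
        WhileScopeSketch env = new WhileScopeSketch();
        env.declare("x", 0);
        env.declare("y", 0);
        while (true) {
            int old = env.lookup("x");
            env.assign("x", old + 1);             // x++
            if (!(old < 1)) break;                // loop condition
            if (bodyInOwnBlock) env.enterBlock();
            env.declare("y", 1);                  // int y = 1;
            if (bodyInOwnBlock) env.exitBlock();
        }
        return env.lookup("y");
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // 1: the outer y was overwritten
        System.out.println(run(true));  // 0: the inner y was discarded
    }
}
```

With `bodyInOwnBlock` set, the declaration inside the loop lands in a fresh block that is popped after each iteration, so the outer `y` keeps its value.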

See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf:

> An iteration statement is a block whose scope is a strict subset of
> the scope of its enclosing block. The loop body is also a block
> whose scope is a strict subset of the scope of the iteration
> statement. (Section 6.8.5, sentence 5, page number 135, absolute page 147)

Possible fix: replace premise `γ′ ⊢ s ⇓ γ″` by `γ′. ⊢ s ⇓ γ″.γ₀` in the
first rule for `while`.

`if` has to be fixed in a similar way, see section 6.8.4, sentence 3.

```c
int y = 0;
if (1) int y = 1; else int y = 2;
return y;
```

This should return `0`, but the current interpreter will return `1`.

Possible fix: replace premise `γ′ ⊢ s ⇓ γ″` by `γ′. ⊢ s ⇓ γ″.γ₀` in the
first rule for `if`. Analogously for the second rule.

#### 5.4 Programs, function definitions, and function calls

p. 86, rule for function call: The subscript `n` in argument `a_n` should be `m`. [#11]

#### 5.7 Interpreting Java bytecode

p. 92: in the last rule for `ifeq L`, the code pointer should become `P+1` when `v != 0`.

### Chapter 6, Code Generation

p. 103: "The last case takes case of ..." → "takes care of"

p. 108 l. 5: _funtypeJVM(...)_ has one extra parenthesis at the end.

p. 108: there is a `&lt;` in the class file template which should
just be `<`.

p. 112: in the code snippet of the `CodeGenerator` class, the `compile(Program p)` method lacks a method body; it needs either a body or an `abstract` declaration.

p. 112: "just a dummy `Object`". Now, Java has class `Void` for that
purpose. The change would affect the following code.

p. 105: The generated code for the while loop in the middle column contains `ifeq goto END`. It should be `ifeq END` without the goto.
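
The corrected instruction sequence for a `while` loop can be sketched as follows. This is a hedged illustration, not the book's `CodeGenerator`: the class `WhileCodeSketch`, the method `compileWhile`, and its parameters are hypothetical, and the condition and body are assumed to be already compiled into lists of Jasmin-style instructions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the book's code generator is organised differently.
class WhileCodeSketch {
    // Emits code for `while (cond) body`, given the instruction lists for
    // the condition and the body and two fresh label names.
    static List<String> compileWhile(List<String> cond, List<String> body,
                                     String test, String end) {
        List<String> code = new ArrayList<>();
        code.add(test + ":");
        code.addAll(cond);          // leaves the condition value on the stack
        code.add("ifeq " + end);    // the operand is the label, with no goto
        code.addAll(body);
        code.add("goto " + test);   // jump back to re-test the condition
        code.add(end + ":");
        return code;
    }

    public static void main(String[] args) {
        // Roughly `while (x) x--;` with x in local variable 0.
        List<String> code = compileWhile(
            Arrays.asList("iload_0"), Arrays.asList("iinc 0 -1"), "TEST", "END");
        code.forEach(System.out::println);
    }
}
```

The only unconditional `goto` in the emitted code is the jump back to the test label at the end of the body.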

### Chapter 7, Functional Programming Languages

#### 7.3 Anonymous functions

p. 128: C++ has had lambda functions since C++11. This is also true for many other mainstream imperative languages nowadays, such as Java and C#, so the comment about imperative languages on the previous page is a little misleading.

#### 7.5 Call by value vs. call by name

p. 132: In other places, the notation "λx.e" is used for anonymous functions, but in the call-by-name application rule the notation "λx → e" is used instead.

p. 134: It is stated that only the application rule differs between call-by-value and call-by-name. This is false. The variable rule, at least as stated in the book on p. 130, needs to change as well. The book’s variable rule gives back the value found in the environment for the variable, but this only works for call-by-value; for call-by-name, to-be-evaluated expressions are stored in the environment rather than values.

The organization of the environment concerning global and local bindings needs more discussion:

> p. 134: It is stated that two environments (i.e. separating functions and variables) are _needed_ to handle recursive definitions: Apparently the author forgot that you can have circular data structures (implemented either straightforwardly in lazy languages like Haskell, or by mutation+references in e.g. Java). (Side note: Another motivation mentioned is that functions and variables should be separated to save memory, but whether memory is saved or not depends on how the environment is represented (consider e.g. linked lists vs. tree-based map data structures).)

The separation of global and local bindings is not necessary, but conceptually cleaner, since global functions can be bound to expressions, whereas local bindings in general need to be bound to closures.

#### 7.8 Polymorphism

p. 139 (and other places, e.g. Sec.
7.9): "->" is typeset incorrectly, e.g. in "since `a = d -> e`" (compare with arrow below).

#### 7.9 Polymorphic type checking with unification

p. 142: in `infer(f,a)`: before `infer(a)`, substitution `γ₁` has to
be applied to the typing context.

### Chapter 8, The Language Design Space

p. 157: BNFC was not ported to Java, C, C++, etc.; rather, the mentioned languages were added as supported backends.

p. 170: the `lin` rules for `TAll` and `TAny` generate a bogus condition. The proper rules are:

```
TAll kind = parenth ("\\p -> and [p x | x <-" ++ kind ++ "]") ;
TAny kind = parenth ("\\p -> or [p x | x <-" ++ kind ++ "]") ;
```

### Appendix A

p. 175: The arcs in this diagram are not really traceable.

### Appendix B

p. 193: dcmpl explanation should be "compare if >*"

p. 194: change the description of "dcmp, dcmpl" to

* dcmpg, dcmpl: take and compare the two topmost doubles on the stack.
  The value left on the stack is 1 if the first is greater than the second,
  0 if they are equal, and -1 if the first is smaller than the second.
  The operations only differ if one of the doubles is NaN (not-a-number).
  Then dcmpg leaves 1, and dcmpl -1.

--------------------------------------------------------------------------------