# A Stateful MIR for Rust

I here propose an enrichment of Rust's core language (the MIR) to support more advanced features.
The list of features I use or enable in the surface language is broad, and should include:

- First-class "initializedness", i.e. making the type system aware of initialized locations (or even values!)
- Typestate
- [`&move` and friends](https://github.com/rust-lang/rfcs/issues/998)
- Moving out of `&mut` temporarily [isn't there an issue for this?]
- Enum variant subtyping
- Safe `DerefMove`
- [Placer API](https://github.com/rust-lang/rfcs/pull/809)
- Partial Moves
- [Non lexical lifetime](http://smallcultfollowing.com/babysteps/blog/2016/05/04/non-lexical-lifetimes-based-on-liveness/)'s "fragments"
- Drop
- [By-value / by `&move` Drop](https://github.com/rust-lang/rfcs/pull/1180)
- [Cyclic drop](https://github.com/rust-lang/rfcs/pull/1327)
- Cyclic init
- [Linear types](https://github.com/rust-lang/rfcs/pull/776)
- Dynamism
- Enums with tags in a separate location
- Desugaring [non-zeroing dynamic drop](https://github.com/rust-lang/rfcs/pull/320)
- [Non-lexical lifetimes](http://smallcultfollowing.com/babysteps/blog/2016/05/09/non-lexical-lifetimes-adding-the-outlives-relation/)'s "outlives at"

Most previous discussion (what I've linked) focuses on the surface language, backwards compatibility, and the like.
Where the discussion gets stuck is that it's not clear what hidden complexity (especially soundness) may be overlooked.
I've instead opted for the route of defining a core language rigorously enough so that readers might convince themselves (or prove!) that the core language is sound and both today's Rust and the proposed extensions can be desugared to it.

The core idea here is to bring back typestate---in this presentation the ability for locations to change type.
With that, many nice abilities arise naturally.


## First-class "Initializedness"

Rust already has limited support for creating locations and initializing them later---see `let x;`.
The problem with its current support is that it's anti-modular: whatever sort of flow analysis pass implements this just looks at the body of a single function and does not allow any operation on the uninitialized local variable.
The solution is two-fold: first, being able to infer "initializedness"---what I am calling the status of whether something is initialized or not---from types alone, and second, allowing locations to change type.

The need for more types is simple enough to understand.
From a tool writer's perspective, types are *the* way to divide-and-conquer or suspend-and-resume some sort of global static analysis: a little local analysis is done, code/data is tagged with what was proved, and that information can now be used remotely.

One question is whether lvalues/locations and rvalues/values both need to know about initializedness: on one hand one only writes to or takes references to lvalues, and currently reading uninitialized locations is prohibited.
On the other hand, the type system doesn't currently discriminate between rvalues and lvalues, and in the interest of simplicity it would be nice to keep it that way.
Here's my tiebreaker: functional struct update (`{ foo: bar, ..baz }`) both could benefit from initializedness (field `foo` need not be initialized in `baz`) and only concerns rvalues.
(While a valid desugaring might introduce some temporaries and assign them, this is not the only approach.)
Therefore I opt to add initializedness to types for rvalues and lvalues alike.
Without any way to inspect an undefined value, this should be perfectly safe.
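
To make the anti-modularity concrete, here is a minimal sketch in today's Rust. The names `pick`, `init`, and `caller` are hypothetical and exist only for illustration. Deferred initialization is accepted while every write is visible inside one function body, but an uninitialized local cannot be handed to a helper to fill in, so the helper has to return the value instead.

```rust
/// Deferred initialization as supported today: the flow analysis sees every
/// write to `x` inside this single function body.
pub fn pick(flag: bool) -> u32 {
    let x; // declared, but not yet initialized
    if flag {
        x = 1;
    } else {
        x = 2;
    }
    x // accepted: `x` is initialized on every path
}

// What cannot be written today: passing the uninitialized local to a helper.
// The compiler rejects the borrow because `x` is not initialized yet.
//
//     fn init_in_place(slot: &mut u32) { *slot = 7; }
//     fn caller() -> u32 { let x; init_in_place(&mut x); x }

/// The usual workaround: the helper produces the value by return instead of
/// writing into a location it was handed.
pub fn init() -> u32 {
    7
}

pub fn caller() -> u32 {
    let x;
    x = init();
    x
}
```

Putting initializedness into types is what would let the signature of a helper like the commented-out `init_in_place` state that it expects an uninitialized location.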

With that alone, however, an uninitialized location can never be used for anything, and an initialized location must stay initialized---we are worse off than before!
But, if we allow a location to change its type (as long as size and alignment are preserved), we get back where we started, and more!
So moving out would change the type of a location from `initialized T` to `uninitialized T`, and the opposite for writes.
("Dropping then writing" is best thought of as two separate steps, but can still look like one step only in the surface language.
More on this and dropping in general later.)
In fact, we could even allow changing from `T` to `U` if they have the same size!
Likewise, we only need one uninitialized type per size.
That said, it's possible to stage all this in the core language or surface language, just starting with stateful initializedness for locations.

Let's now start to formalize our new MIR, limiting ourselves to the features that have been introduced so far.

For convenience,
```
anything* ::= | anything ',' anything*
```
aka comma-separated repetition, 0 or more times.
As these grammars are really "language agnostic ADTs" and not here for parsing, I may play fast and loose with trailing commas and the like.

lvalues are global or whole-function scoped:
```
Static, s  ::= 'static0' | 'static1' | ...
Local,  l  ::= 'local0' | 'local1' | ...
Param,  p  ::= 'param0' | 'param1' | ...
LValue, lv ::= 'ret_slot' | Static | Local | Param
```

Types, constants, and rvalues are largely "what one would expect".
Additionally, there is a (size-indexed) uninitialized type and an inscrutable constant inhabiting it, and an absurd type, with no inhabitants:
```
Size, N       ::= '0b' | '1b' | '2b' | ...
TypeBuiltin   ::= 'Uninit'<Size> | ! | ...
TypeParam, TP ::= 'TParam0' | 'TParam1' | ...
TypeUserDef   ::= 'User0' | 'User1' | ...
Type, T       ::= TypeBuiltin | TypeParam | TypeUserDef
```
The absurd type is so absurd it can be all sizes for all lvalues (thanks @RalfJung!); there is no need to give it a size index.

```
Constant, c ::= 'uninit'::<Size> | ...
Operand, o  ::= 'const'(Constant)
             |  'consume'(LValue)
Unop, u     ::= ...
Binop, b    ::= ...
RValue, rv  ::= 'use'(Operand)
             |  'unop'(Unop, Operand)
             |  'binop'(Binop, Operand, Operand)
```

An *lvalue context* represents the contents of each lvalue *at a node in the CFG*.
A well-formed lvalue context must assign a type to each lvalue exactly once---it is conceptually a function or total map from lvalues to types.
A *static context* is just an lvalue context restricted to statics.
```
LValueContext, LV ::= (LValue: Type)*
StaticContext, S  ::= (Static: Type)*
```

For now, we will only worry about `Copy` implementations, but more generally a *type context* keeps track of user-defined types, type parameters, and trait bounds.
A trait bound can involve only user-defined types, in which case it represents an impl, or it can involve type parameters, in which case it represents a postulate from a where clause.
[N.B. this roughly corresponds to rustc's `ParameterEnvironment`.]
```
Trait, tr       ::= # some globally-unique identifier
TraitBound, trb ::= Type ':' Trait
TypePremise     ::= TraitBound | TypeParam | TypeUserDef
TypeContext, TC ::= TypePremise*
```
As a basic scoping rule, a parameter should come before any bound that uses it in a type context.

Nodes in the CFG we will think of as continuations: they do not return, and while they don't take any arguments, the types of locations can be enforced as a prerequisite to them being called.
```
Label, k      ::= 'enter' | 'exit'
               |  'k0' | 'k1' | ...
Node, e       ::= 'Assign'(LValue, RValue, Label)
               |  'DropCopy'(LValue, Label)
               |  'If'(Operand, Label, Label)
               |  'Call'(LValue, Operand*, Label)
               |  'Unreachable'
               |  ... # et cetera
NodeType, ¬T  ::= ¬(LValueContext)
CfgContext, K ::= (Label : NodeType)*
```

And now the judgments.
Operands and rvalues have not one but two lvalue contexts, an "in context" and an "out context".
This pattern allows us to thread some state.

Constants can become rvalues whenever, and the in-context and out-context are only constrained to be equal.
```
Const:
  TC ⊢ c: T
  ────────────────────────────
  TC; LV; LV ⊢ const(c): T
```

Consumption is more complex.
We need to uninitialize the lvalue iff the type is `!Copy`.
```
MoveConsume:
  ───────────────────────────────────────────────────────────────
  TC, T: !Copy; LV, lv: T; LV, lv: Uninit<_> ⊢ consume(lv): T
```
```
CopyConsume:
  ──────────────────────────────────────────────────────
  TC, T: Copy; LV, lv: T; LV, lv: T ⊢ consume(lv): T
```

The actual threading of the state is in the rvalue introducers.
Note that the order of the threading does not matter---our state transformations are commutative.
```
Use:
  TC; LV₀; LV₁ ⊢ o: T
  ──────────────────────────
  TC; LV₀; LV₁ ⊢ use(o): T
```
```
UnOp:
  TC; LV₀; LV₁ ⊢ o: T
  u: fn(T) -> Tᵣ               # primops need no context
  ──────────────────────────────
  TC; LV₀; LV₁ ⊢ unop(u, o): Tᵣ
```
```
BinOp:
  TC; LV₀; LV₁ ⊢ o₀: T₀
  TC; LV₁; LV₂ ⊢ o₁: T₁
  b: fn(T₀, T₁) -> Tᵣ          # primops need no context
  ─────────────────────────────────────
  TC; LV₀; LV₂ ⊢ binop(b, o₀, o₁): Tᵣ
```

Nodes do not have two lvalue contexts, because viewed as continuations they don't return.
The out contexts of their operand(s) instead constrain their successor(s).
Assignment is perhaps the most important operation:
```
Assign:
  TC; S, LV, lv: Uninit<_>; S, LV, lv: Uninit<_> ⊢ rv: T
  TC; S; K ⊢ k: ¬(LV, lv: T)
  ───────────────────────────────────────────────────────────
  TC; S; K ⊢ assign(lv, rv, k): ¬(LV, lv: Uninit<_>)
```
Note that the lvalue to be assigned must be uninitialized prior to assignment, and the rvalue must not affect it, so moving from an lvalue to itself is prohibited.
[Also note that making `K, k: _ ⊢ ...` the conclusion instead of making `... ⊢ k: _` a premise would work equally well, but this is easier to read.]

Call resembles `Unop` and `Binop`, since it's the moral equivalent for calling user-defined instead of primitive functions.
Functions do have type parameters, so we must substitute type args for type parameters.
While it's not too interesting now, it will become more interesting later.
```
Call:
  T₀ = for<TP*> fn(T₁..Tₙ) -> Tᵣ where trb*
  ∀i.
    TC; S, LVᵢ; S, LVᵢ₊₁ ⊢ oᵢ : Tᵢ [Tₜₐ/TP]*
  TC, trb*; S; K ⊢ k: ¬(LVₙ₊₁, lv: Tᵣ)
  ────────────────────────────────────────────────────────────────
  TC, trb*; S; K ⊢ call(lv, o*, k): ¬(LV₀, lv: Uninit<_>)
```

We can define diverging functions simply by never calling 'exit' and creating a cycle in the CFG instead.
But when calling a non-terminating function, we still need to provide a successor.
This unreachable node can be used whenever there exists an lvalue with type `!`.
Since that is the return type of a diverging function, after we return we will have an lvalue with that type (the return slot), and thus we can use this as a successor.
This is also useful for an unreachable branch in an enum match (corresponding to an absurd variant).
```
Unreachable:
  ──────────────────────────────────────
  TC; S; K ⊢ unreachable: ¬(LV₀, lv: !)
```

In this formulation, everything is explicit, so we also need to drop copy types (even if such a node is compiled to nothing) to mark them as uninitialized.
```
CopyDrop:
  TC, T: Copy; S; K ⊢ k: ¬(LV, lv: Uninit<_>)
  ────────────────────────────────────────────────
  TC, T: Copy; S; K ⊢ drop(lv, k): ¬(LV, lv: T)
```

And here is `if`.
I could go on, but hopefully the basic pattern is clear.
```
If:
  TC; S, LV₀; S, LV₁ ⊢ o: T
  TC; S; K ⊢ k₀: ¬(LV₁)
  TC; S; K ⊢ k₁: ¬(LV₁)
  ──────────────────────────────────
  TC; S; K ⊢ if(o, k₀, k₁): ¬(LV₀)
```

And finally, the big "let-rec" that ties the "knot" of continuations together into the CFG --- and a function.
Every node in the CFG is postulated (node `eᵢ`, with type `¬Tᵢ`), and bound to a label (`kᵢ`).
```
Fn:
  k₀ = enter
  T₀ = ¬((s: Tₛ)*, (a: Tₚ)*, (l: Uninit<_>)*, ret_slot: Uninit<_>)
  ∀i.
    TC;   # trait impls
    S;    # statics
    l*;   # locals (just the location names, no types)
    K,    # user labels, K = { kₙ: ¬Tₙ | n }
    exit: ¬((s: Tₛ)*, (a: Uninit<_>)*, (l: Uninit<_>)*, ret_slot: Tᵣ);
    ⊢ eᵢ: ¬Tᵢ
  ─────────────────────────────────────────────────────────────────────────────
  TC; S ⊢ Mir { params, locals, labels: { (k: ¬T = e)* }, .. }: fn(Tₚ*) -> Tᵣ
```
Note the two special labels, 'enter' and 'exit'.
'enter' is defined like any other node, but must exist and match the function's signature.
Specifically, it requires that all locals are uninitialized, and all parameters are initialized to match the type of the function.
'exit' isn't defined at all, but bound in the CfgContext so nodes can choose to exit as a successor.
It requires that all locals and args are uninitialized, but the "return slot" is initialized.

For completeness, we can parameterize the MIR with type parameters and trait bounds like this:
```
FnGeneric:
  TC, TP*, trb*; S ⊢ f: fn(Tₚ*) -> Tᵣ
  ───────────────────────────────────────────────────────────
  TC; S ⊢ (Λ f): for<TP*> fn(Tₚ*) -> Tᵣ where trb*
```

Ok, make sense? I've of course left many parts of the existing MIR unaccounted for: compound lvalues, lifetimes, references, panicking, mutability, aliasing, and more.
Also, I only gave the introducers (I trust the MIRi devs to figure out the eliminators!).
Yet this IR is pretty advanced in some other ways.
Besides changing the types of locations, we have full support for linear types---note that everything must be manually deinitialized.
Going forward, I'll extend this IR little by little until everything is covered.

[A final note: `CopyDrop` and `CopyConsume` could easily refer to different traits instead of both to `Copy`, to disassociate the ability to be memcopied from the ability to be forgotten.
I have proposed splitting `Copy` like this in the past, but will not mention it again, as nothing else depends on this.]


## Enum Switch

One thing that I haven't explicitly mentioned yet is the subtyping relation over continuation types.
There is no width subtyping because the lvalue context must assign a type to all lvalues.
Otherwise, values could be forgotten without being dropped.
There is (contravariant) depth-subtyping, however:
```
SubContLValue:
  b <: a
  ────────────────────────────
  ¬(LV, lv: a) <: ¬(LV, lv: b)
```
the intuition being that a continuation can care only somewhat about an lvalue.
Overall, nothing non-standard here.

Why do I bring this up?
Well, while the MIR is mostly safe, enum tagging currently needs a downcast in each branch.
Now the downcast is adjacent to the switch, so it's not that hard to verify, but still it would be nice to have a single node with a safe interface.
Enum variant subtypes have been proposed for a variety of reasons, but fit perfectly here.
Enum switch becomes:
```
Switch:
  (∪ᵢ Tᵢ) :> T
  ∀i.
    TC; S; K ⊢ kᵢ: ¬(LV, lv: Tᵢ)
  ────────────────────────────────────────────
  TC; S; K ⊢ switch(lv, T, k*): ¬(LV, lv: T)
```
[On the first line, that's a union not the letter 'U'.] The union isn't me introducing union types (whew!), but just saying that these `Tᵢ` "cover" `T`.
The nodes branched from the switch each expect a different variant, and the switch dispatches to the one expecting the variant that's actually there.


## Lifetimes

Niko's recent blog posts cover this very well, so I will build on them.

The "non-lexical" in "non-lexical lifetime" is valid, but leaves out the important detail that such lifetimes *are* lexical with respect to the MIR.
Indeed, this is one of the main motivations for the MIR.
Not only does this make lifetime inference as we currently do it simpler, but it also means we can explicitly represent lifetimes.
As I mentioned above, in formalisms like this, I believe everything should be explicit, and so lifetimes will be too.

Lifetimes will be abstract function-global labels, just as lvalues have been defined as abstract function-global labels.
Furthermore, just as lvalues can correspond to local variables or parameters, so lifetimes can exist internal to the function body or be parameters.
Finally, there is one static lifetime, but many static lvalues (less symmetry, oh well).
```
LifetimeLocal, 'l ::= '\'local0' | '\'local1' | ...
LifetimeParam, 'p ::= '\'param0' | '\'param1' | ...
Lifetime, 'a      ::= '\'static' | LifetimeLocal | LifetimeParam
```

As a first approximation, continuation types will be extended to include the set of lifetimes the node inhabits, hereafter called the "active lifetimes".
```
LifetimeContext, LC ::= Lifetime*
NodeType, ¬T        ::= ¬(LValueContext; LifetimeContext)
```
As lvalue contexts must be proper maps, so lifetime contexts must be proper sets when invoking any judgment.
All node introducers so far will be modified to simply propagate the lifetime context: whatever lifetimes include a node's successors will also include the node itself.
Similarly, there is no continuation subtyping related to "active lifetimes"; they must match exactly.

Now, everything so far, taken alone, renders lifetimes worthless, because all nodes would inhabit the same lifetimes!
To remedy this, we'll have dedicated nodes to begin and end lifetimes: their single successor will have one more or one fewer active lifetime than they do.
For now, let's define them as:
```
LifetimeBegin:
  TC; S; K ⊢ k: ¬(LV; LC, 'l)
  ────────────────────────────────────
  TC; S; K ⊢ begin('l, k): ¬(LV; LC)
```
```
LifetimeEnd:
  'l ∉ LV
  TC; S; K ⊢ k: ¬(LV; LC)
  ──────────────────────────────────────
  TC; S; K ⊢ end('l, k): ¬(LV; LC, 'l)
```
The additional premise, that `'l` is not in any (current) type of any location, should ensure that values do not outlive their lifetime.
It would be nicer to use a "strict-outlives" relation, but we don't have that.
This is a bit unduly restrictive if we choose to support contravariant lifetimes, but either way it is sound.
Finally, recall that `'l` is a meta-variable for local lifetimes. Parameter lifetimes and `'static` cannot be begun or ended.

I should remark on the design principles that led me to this formalism.
Niko's [third blog post](http://smallcultfollowing.com/babysteps/blog/2016/05/09/non-lexical-lifetimes-adding-the-outlives-relation/) goes over the limitations of single-entry-multiple-exit lifetimes defined with dominators, and the series instead opts to define lifetimes as subsets of CFG nodes which satisfy some conditions.
He also pointed out that lifetimes could be thought of as "sets of paths" through a graph.
I like imagining lifetimes as "sets of paths", and that leads me to believe we should focus on the paths' endpoints more than their interiors.
A set of nodes focuses on active lifetimes and leaves the lifetime boundary implicit, but having explicit start/end nodes does the opposite.
Also, while one could make variants of nodes that introduce lifetimes so we need not introduce extra "no-op" begin/end nodes, that would lead to either an explosion of rules, or more complicated rules.

Niko's [second blog post](http://smallcultfollowing.com/babysteps/blog/2016/05/04/non-lexical-lifetimes-based-on-liveness/) on non-lexical lifetimes proposes that instead of requiring the type of a binding to outlive the *scope* of the binding, we should merely require that it outlives the *live-range* of the binding.
In my formulation, that is a particularly natural strategy.
Consider that only during the lifetime in which a variable is live will its lvalue have that (initialized) type.
Its last use is in fact its destruction, after which the lvalue's type is `Uninit<_>`.
To me, this signifies that the "moral equivalent" of a scope for this IR is in fact exactly this "live-range" lifetime.
While it is possible to derive lifetimes based on scopes in the surface language, they are fairly meaningless.

Niko's third blog post also goes over the need for an "outlives-at" relation.
The basic idea is that we often (always?) only care which lifetime ends later, and don't care which began earlier.
More on that in a bit, but for now I'll explain how this can justify introducing references within an already active lifetime.
Obviously the newly-introduced reference didn't exist from the beginning of the lifetime, but as long as it is destroyed before the lifetime ends, the lifetime *outlives* the reference *at* the moment of introduction.
Because lifetimes are based on scopes today, we already allow the bounding lifetime of a reference to extend past the reference's last use, and thus can be confident that a dedicated sort of node just to end lifetimes is sufficiently expressive.
If allowing the creation of references in already active lifetimes is indeed sound, then a dedicated sort of node to begin lifetimes works too.

### Outliving

Now, let us talk about the outlives relationship in detail.
I tried to think of a situation where the new "outlives-at" relation wouldn't suffice, and the old "outlives" relation was needed, but I failed.
The rules I came up with keep the "at" implicit, however, so our `x: y` syntax will remain the same.

As before, a lifetime can bound another lifetime or a type.
Lifetime bounds are bundled in a *bound context* [running out of names, I am].
```
LifetimeBound, lb ::= Lifetime ':' Lifetime
                   |  Type ':' Lifetime
BoundContext, BC  ::= LifetimeBound*
```

Recall the outlives relation defined in [RFC 1214](https://github.com/rust-lang/rfcs/blob/master/text/1214-projections-lifetimes-and-wf.md).
It doesn't concern itself with surface language terms or the CFG, but simply the derivation of outlives bounds from other outlives bounds.
We will use it here (with the slight exception of a different rule for unique references, as those will be defined differently when they are introduced in a later section).
Where the judgements refer to "the environment", we will make that precise by using the bound context.
Thus, prepend all judgements with `BC; ` in the original rules, like
```
BC; R ⊢ 'foo: 'bar
```
The reasoning behind keeping that "at" part implicit is that when working with outlives-at judgements that all share the same "at", I believe one can ignore the position altogether.
If that is not the case, RFC 1214's outlives will indeed need to be further modified.

Unfortunately, we must modify continuation types again, giving them---you guessed it---a bound context.
```
NodeType, ¬T ::= ¬(LValueContext; LifetimeContext; BoundContext)
```
Again, all existing rules will blindly propagate the bound context to their successors.
But, this time there is a new subtyping rule:
```
SubContOblig:
  ∀lb ∈ OB₁. OB₀; <> ⊢ lb
  ────────────────────────────────
  ¬(LV; LC; OB₁) <: ¬(LV; LC; OB₀)
```
Note that the 0 and 1 subscripts are reversed from what one might expect.
Instead of imagining the bounds as *proofs our continuations require*, imagine them as
*obligations our continuations discharge*.
So, given what (all) successors are obligated to carry out, one can deduce bounds using RFC 1214's outlives, and give those new bounds as obligations the current node can carry out.

Why all this?
Consider that there is no evidence in the "past" or "present" with which to prove the outlives-at relation.
The best one can do is charge all successors to witness `'b` dying no later than `'a` for `'a: 'b`.
Consider also that if we had stayed with the original outlives relation, each node would both obligate its successors and require a proof from its predecessors.
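
The premise of `SubContOblig` can be sketched as a small entailment test. This is only an illustration under strong assumptions: bounds are restricted to lifetime-lifetime pairs, lifetimes are plain strings, and the only deduction steps are reflexivity and transitivity, whereas RFC 1214's outlives relation has more rules. The names `Bound`, `entails`, and `sub_cont_oblig` are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical encoding of a lifetime-lifetime bound `longer: shorter`.
pub type Bound = (&'static str, &'static str);

/// Assumption: the only deduction steps are reflexivity and transitivity.
/// Returns true if `goal` follows from the bounds in `given`.
pub fn entails(given: &[Bound], goal: Bound) -> bool {
    let (from, to) = goal;
    // Adjacency list: each lifetime maps to the lifetimes it directly outlives.
    let mut edges: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(longer, shorter) in given {
        edges.entry(longer).or_default().push(shorter);
    }
    // Depth-first search from `from`; reaching `to` is a derivation chain.
    let mut seen: HashSet<&str> = HashSet::new();
    let mut stack = vec![from];
    while let Some(current) = stack.pop() {
        if current == to {
            return true; // also covers reflexivity when from == to
        }
        if seen.insert(current) {
            if let Some(next) = edges.get(current) {
                stack.extend(next.iter().copied());
            }
        }
    }
    false
}

/// The premise of `SubContOblig` as written: every bound in `ob1` must be
/// derivable from `ob0` for ¬(LV; LC; OB₁) <: ¬(LV; LC; OB₀) to hold.
pub fn sub_cont_oblig(ob1: &[Bound], ob0: &[Bound]) -> bool {
    ob1.iter().all(|&lb| entails(ob0, lb))
}
```

With `ob0` containing `'a: 'b` and `'b: 'c`, the derived bound `'a: 'c` passes the check, while `'c: 'a` does not.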

Hopefully it is clear now that we will need to modify `LifetimeEnd` to discharge these obligations:
```
LifetimeEnd:
  'l ∉ LV
  TC; S; K ⊢ k: ¬(LV; LC; BC)
  ───────────────────────────────────────────────────────────────
  TC; S; K ⊢ end('l, k): ¬(LV; LC, 'l; BC, {'l: 'a | 'a ∈ LC })
```
The set-builder notation says we discharge obligations for each lifetime in `LC`.
(Of course, it should be fine if the node's predecessors don't care about every lifetime in `LC` being outlived. The subtyping rule takes care of that.)

We can now redefine `Call`, `Fn`, and `FnGeneric`.
`Call` has three additional jobs.
First, it needs to provide lifetime arguments drawn from the set of active lifetimes.
Second, it needs to substitute those as well for lifetime parameters in the argument types.
Third, it needs to propagate obligations to satisfy the lifetime bounds from the where clause.
```
Call:
  TC ⊢ trb*
  OB ⊢ lb*
  T₀ = for<TP*, 'p*> fn(T₁..Tₙ) -> Tᵣ where trb*, lb*
  ∀i.
    TC; S, LVᵢ; S, LVᵢ₊₁ ⊢ oᵢ : Tᵢ [Tₜₐ/TP]* ['a/'p]*
  TC; S; K ⊢ k: ¬(LVₙ₊₁, lv: Tᵣ; LC; OB)
  ───────────────────────────────────────────────────────────────────────
  TC; S; K ⊢ call(lv, o*, k): ¬(LV₀, lv: Uninit<_>; LC; OB)
```
[I switched from `TC, trb* ⊢ ...` to making `TC ⊢ trb*` a separate premise just for legibility.]

`Fn`, not `FnGeneric`, is responsible for lifetime parameters and lifetime bounds.
Lifetime parameters and `'static` become active lifetimes for `enter` and `exit`.
`exit` also satisfies obligations for all lifetime bounds, allowing the bounds to be propagated backwards into the rest of the CFG.
```
Fn:
  k₀ = enter
  T₀ = ¬((s: Tₛ)*, (a: Tₚ)*, (l: Uninit<_>)*, ret_slot: Uninit<_>; 'static, 'p*)
  ∀i.
    TC;   # trait impls
    S;    # statics
    l*;   # locals (just the location names, no types)
    K,    # user labels, K = { kₙ: ¬Tₙ | n }
    exit: ¬((s: Tₛ)*, (a: Uninit<_>)*, (l: Uninit<_>)*, ret_slot: Tᵣ; 'static, 'p*; BC);
    ⊢ eᵢ: ¬Tᵢ
  ─────────────────────────────────────────────────────────────────────────────────────────
  TC; S ⊢ Mir { params, locals, labels: { (k: ¬T = e)* }, .. }:
            for<'p*> fn(Tₚ*) -> Tᵣ where BC
```

`FnGeneric` is basically the same but prepends type parameters to lifetime parameters and prepends trait bounds to lifetime bounds:
```
FnGeneric:
  TC, TP*, trb*; S ⊢ f: for<'p*> fn(Tₚ*) -> Tᵣ where BC
  ────────────────────────────────────────────────────────────────────
  TC; S ⊢ (Λ f): for<TP*, 'p*> fn(Tₚ*) -> Tᵣ where trb*, BC
```

I should note that the outlives-at relation defined this way has some perhaps surprising consequences.
Consider a CFG of a single loop with two lifetimes twice overlapping.
Cut and unrolled, the loop looks like:
```
   'a ┌─────────────────────┐
┅━━━━━┷━━━━┯═══════════╤════╧━━━━━┅
┄──────────┘    'b     └──────────┄

→ CFG edge direction (time)
━ 'a: 'b
═ 'b: 'a
```
The imaginary cut is at the dotted lines, and the brackets labeled `'a` and `'b` denote each lifetime.
Moving from left to right, after `'b` ends and until `'a` ends, one can derive that `'b: 'a`, because `'b` is alive at `'a`'s ending.
But during the other half of the CFG loop, the opposite can be derived!
This shows that no precaution is taken against lifetimes resurrecting after they are presumed dead.
This might seem highly dangerous, but note that when a lifetime dies, nothing associated with it can still exist because no lvalue can include it in its type.
Thus, no pointer invalidation can occur.

To end the section, I want to close with Niko's most unruly example from the third blog post.
In Rust, it looks like:

> ```rust
> let mut map1 = HashMap::new();
> let mut map2 = HashMap::new();
> let key = ...;
> let map_ref1 = &mut map1;
> let v1 = map_ref1.get_mut(&key);
> let v0;
> if some_condition {
>     v0 = v1.unwrap();
> } else {
>     let map_ref2 = &mut map2;
>     let v2 = map_ref2.get_mut(&key);
>     v0 = v2.unwrap();
>     map1.insert(...);
> }
> use(v0);
> ```

As our CFG it looks like:

```
                A [ map1 = HashMap::new() ]
                1 [ map2 = HashMap::new() ]
                2 [ key: K = ... ]
                3 [ begin('x) ]
                3 [ map_ref1 = &mut map1 ]
                4 [ v1 = map_ref1.get_mut(&key) ]
                5 [ if some_condition ]
                   |                |
                  true            false
                   |                |
                   v                v
  B [ v0 = v1.unwrap() ]   C [ destroy(v0) ]
  1 [ goto ]               1 [ end('x) ]
         |                 2 [ begin('x) ]
         |                 3 [ map_ref2 = &mut map2 ]
         |                 4 [ v2 = map_ref2.get_mut(&key) ]
         |                 5 [ begin('x) ]
         |                 6 [ v0 = v2.unwrap() ]
         |                 7 [ map1.insert(...) ]
         |                 8 [ goto ]
         |                    |
         v                    v
         D [ use(v0) ]
         1 [ end('x) ]
```

As is shown, we can build this with a single lifetime!
Start before `A3`, to include `v1`.
Give `v0`'s ref the same lifetime on both branches.
On the right branch, end and begin again before `C3`; as the next section will demonstrate, this "clears" the borrow on `map1`.
Finally, begin again before assigning `v0` so that it can be given the lifetime as prescribed above.
Note that `B1` and `C8` have different sets of lvalues borrowed before the merge at `D0`; that's fine.
One can imagine borrowing and then immediately throwing away the reference, but the location must stay borrowed until the lifetime the reference was associated with ends.
Thus it is fine to skip the intermediate step, and allow one to coerce lvalues to their borrowed equivalents.


## Unique References (Generalizing `&mut`)

Lvalue typestate might be the great enabler of everything else proposed in this document, but is somewhat useless on its own.
With references as they exist today, all uninitialized lvalues correspond to locals, and the one thing one can do with them---write to them---is already supported.
As to changing the type of locals, one can simply use more locals and hope an optimization pass will figure out how to reuse stack slots.

But consider how we might generalize unique references.
In brief, references provide access to locations, so one can convert a reference to an lvalue (deref), or an lvalue to a reference (ref).
That much will stay the same between the existing MIR and this.
With unique references in particular, for every reference created, one preexisting lvalue is made unusable, so there is never more than one way to access the "real" location.
Thus, in some sense, a unique reference is as good as "direct" access to the original lvalue.
It would thus be nice if one could do everything via a unique reference that can be done with a "top-level" lvalue (i.e. one *not* from a deref).
With first-class initializedness, that "everything" boils down to changing the type of references' contents.

A first approximation of this might keep the existing `&mut T` type, and simply change a reference's type parameter along with its backing lvalue on writes and move-outs.
There are two problems with this, however.
First, when a function takes some unique reference arguments, the backing lvalues aren't in scope, so there's no way to change them.
The caller would have no idea what the callee is doing with the references.
575 | Second, because of control flow merges, it may not be known which lvalue a reference points to, and thus which lvalues' types to update. 576 | For example, see `D0` in the CFG at the end of the lifetimes section. 577 | `v0` may point into either `map1` or `map2`; there's no way of knowing! 578 | 579 | The solution is to give unique references *two* type parameters. 580 | I will use the syntax `&mut<'a, Tᵢ, Tₒ>`. 581 | The first parameter is the current type or input type, and represents what's currently in the borrowed location. 582 | The second is the residual type or output type, and represents what must be there when the reference is dropped. 583 | This solves the first problem because functions' types now state the condition they will leave any borrowed locations in. 584 | This also solves the second problem because when we mark a location borrowed, we can simultaneously set it to the newly-created reference's residual type. 585 | Doing so ensures that on the branch where the location *wasn't* mutated via the reference, it already has the type it would have had it been. 586 | 587 | The subtyping rule is as follows: 588 | ``` 589 | SubUniqRef: 590 | 'a₀ <: 'a₁ 591 | Tᵢ₀ <: Tᵢ₁ 592 | ────────────────────────────────────────── 593 | &mut<'a₀, Tᵢ₀, Tₒ> <: &mut<'a₁, Tᵢ₁, Tₒ> 594 | ``` 595 | Unique references are covariant with respect to the first type parameter because it affects the first *read*. 596 | They would be contravariant with respect to the second type parameter because it affects the last *write*, but since we don't support contravariance they are invariant instead. 597 | 598 | The well-formedness rule (from RFC 1214) is as follows: 599 | ``` 600 | WfUniqRef: 601 | R ⊢ Tᵢ WF 602 | R ⊢ Tₒ WF 603 | ─────────────────────── 604 | R ⊢ &mut<'a, Tᵢ, Tₒ> WF 605 | ``` 606 | Note that neither type argument need outlive the lifetime of the reference! 607 | This is a fairly subtle point that deserves some remark.
608 | The lifetime of the unique reference is the lifetime that the *location* is borrowed; the *contents* of that location are fully owned by the reference. 609 | It could well be that the contents cannot outlive a different lifetime that ends earlier, 610 | and thus must be destroyed first. 611 | That's no problem, because as was already stated, the unique reference allows one to change the type of the contents just as one can do with a top-level lvalue. 612 | The residual type must outlive the reference's lifetime *at* the moment when the reference is dropped, but symmetrically that type may not be inhabited when the reference was created. 613 | 614 | The outlives rule (again from RFC 1214) is: 615 | ``` 616 | OutlivesUniqRef: 617 | R ⊢ 'a₀: 'a₁ 618 | R ⊢ Tᵢ: 'a₁ 619 | R ⊢ Tₒ: 'a₁ 620 | ────────────────────────── 621 | R ⊢ &mut<'a₀, Tᵢ, Tₒ>: 'a₁ 622 | ``` 623 | Requiring `Tᵢ: 'a₁` is easy to decide on as it imposes no onerous restriction. 624 | If the reference is left containing `Tᵢ` after `Tᵢ` is no longer alive, and the lifetime(s) in which `Tᵢ` is alive do not begin again until after the reference must be dropped (if they begin again at all), the reference is incapable of being dropped. 625 | `Tₒ: 'a₁`, while somewhat conservative, follows from making the output type parameter invariant instead of contravariant. 626 | It also matches the outlives rules for trait objects and function types. 627 | 628 | To ensure that the residual type does indeed reflect the location pointed to by the reference before it is dropped, we only allow dropping unique references when the current type matches the residual type. 629 | ``` 630 | DropUniqRef: 631 | TC, T; S; K ⊢ k: ¬(LV, lv: Uninit<_>; LC; BC) 632 | ─────────────────────────────────────────────────────────────── 633 | TC, T; S; K ⊢ drop(lv, k): ¬(LV, lv: &mut<'a, T, T>; LC; BC) 634 | ``` 635 | Notice that both type arguments are `T`.
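The current/residual split and the `DropUniqRef` side condition can be loosely imitated in today's Rust with phantom type parameters. The following is only a sketch under stated assumptions, not the proposed MIR: the names `UniqRef`, `Init`, `Uninit`, `borrow_init`, `take`, `write`, `finish`, and `replace_via_move_out` are all hypothetical, an `Option<T>` tracks initializedness at run time so that no `unsafe` is needed, and nothing stops a `UniqRef` from being dropped implicitly, which the real proposal would forbid.

```rust
use std::marker::PhantomData;

// Hypothetical marker types: "holds a T" and the document's `Uninit<_>`.
pub struct Init;
pub struct Uninit;

/// Hypothetical stand-in for `&mut<'a, Cur, Res>` over a location of type `T`.
/// `Cur` is the current state of the location, `Res` the promised residual state.
pub struct UniqRef<'a, T, Cur, Res> {
    slot: &'a mut Option<T>,
    _state: PhantomData<(Cur, Res)>,
}

impl<'a, T, Res> UniqRef<'a, T, Init, Res> {
    /// Borrow an initialized location; fails if the location is empty.
    pub fn borrow_init(slot: &'a mut Option<T>) -> Option<Self> {
        if slot.is_some() {
            Some(UniqRef { slot, _state: PhantomData })
        } else {
            None
        }
    }

    /// Move out of the reference: the current state goes from `Init` to
    /// `Uninit`, while the residual state is left untouched.
    pub fn take(self) -> (T, UniqRef<'a, T, Uninit, Res>) {
        let value = self.slot.take().expect("state parameter says Init");
        (value, UniqRef { slot: self.slot, _state: PhantomData })
    }
}

impl<'a, T, Res> UniqRef<'a, T, Uninit, Res> {
    /// Write into an uninitialized location: `Uninit` becomes `Init`.
    pub fn write(self, value: T) -> UniqRef<'a, T, Init, Res> {
        *self.slot = Some(value);
        UniqRef { slot: self.slot, _state: PhantomData }
    }
}

impl<'a, T, S> UniqRef<'a, T, S, S> {
    /// Analogue of DropUniqRef: only callable when current == residual.
    pub fn finish(self) {}
}

/// Moves the old string out, writes a new one back, then drops the reference.
pub fn replace_via_move_out(slot: &mut Option<String>) -> Option<String> {
    let r: UniqRef<'_, String, Init, Init> = UniqRef::borrow_init(slot)?;
    let (old, r) = r.take(); // r: UniqRef<_, Uninit, Init>
    // r.finish();           // would not type-check: Uninit != Init
    let r = r.write(format!("{}!", old)); // r: UniqRef<_, Init, Init>
    r.finish();
    Some(old)
}
```

In this imitation the state only lives in the wrapper's type; the borrowed local itself never changes type, which is exactly the part that needs the `BorrowedMut` machinery described next.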
636 | 637 | Besides unique reference types themselves, there are two more constructs that must be introduced: a borrowed type, and projections. 638 | Uniquely borrowing a location in Rust prevents all interaction with that location except through the reference. 639 | Currently we prevent access to the borrowed lvalue through more static analysis, just as we prevent access to uninitialized lvalues. 640 | And just as I instead opted for an uninitialized type family, so I will opt for a borrowed type family: `BorrowedMut<'a, T>`. 641 | `BorrowedMut<'a, T>` is a simple newtype around `T`, "protecting" `T` during `'a`. 642 | `'a` would be contravariant and so is invariant. 643 | ``` 644 | SubBorrowedMut: 645 | T₀ <: T₁ 646 | ──────────────────────────────────────── 647 | BorrowedMut<'a, T₀> <: BorrowedMut<'a, T₁> 648 | ``` 649 | The reason for the would-be contravariance is simple: while a `&mut<'a, _, T>` must be destroyed before `'a` ends, its associated `BorrowedMut<'a, T>` must turn back into `T` after `'a` ends, to guarantee that aliasing is prevented. 650 | 651 | Because `BorrowedMut<'a, T>` turns back into `T` after `'a`, the well-formedness rule has the additional restriction that `T: 'a`. 652 | ``` 653 | WfBorrowedMut: 654 | R ⊢ T WF 655 | R ⊢ T: 'a 656 | ───────────────────────── 657 | R ⊢ BorrowedMut<'a, T> WF 658 | ``` 659 | [Just like with `LifetimeEnd`, it might be nice to use a strict outlives relation here, but we don't have it at our disposal. 660 | Thankfully not using strict outlives won't create any soundness problems but rather simply delay the catching of errors.] 661 | 662 | Outlives for `BorrowedMut` retains a shred of contravariance in that it outlives its parameter (and must do so for reasons explained above). 663 | Hopefully this doesn't cause any problems.
664 | ``` 665 | OutlivesBorrowedMut: 666 | R ⊢ T: 'a₁ 667 | ───────────────────────── 668 | R ⊢ BorrowedMut<'a₀, T>: 'a₁ 669 | ``` 670 | 671 | Additionally, any unborrowed lvalue is a subtype of its borrowed equivalent. 672 | This is needed for control flow merges as I talked about above; the shared successor needs to conservatively prohibit access to potentially-borrowed locations. 673 | (In the example above `map1` and `map2` would need to be borrowed at `D0`.) 674 | ``` 675 | SubBorrowUniquely: 676 | ────────────────────────────────────── 677 | T <: BorrowedMut<'a, T> 678 | ``` 679 | Note, I think it should be possible to achieve this with an "lvalue cast" instead of subtyping if that is desired. 680 | 681 | The crucial attribute of `BorrowedMut` is that lvalues of it cannot be inspected in any way. 682 | There has been some talk of non-movable types, so I will presume the existence of a `Move` trait whose sole purpose is to gate the use of the `consume` operand. 683 | While we're at it, one might as well also make `Uninit<_>: !Move` so the `uninit` constant can be dispensed with. 684 | If the new trait sounds like mission creep, it is perfectly possible to also just dispense with the `Move` trait and special-case prohibit `BorrowedMut` (and `Uninit` too) from being consumed. 685 | 686 | Now we have enough in place for unique reference introduction. 687 | References are introduced as an rvalue, with the same grammar as today. 688 | ``` 689 | BorrowKind ::= 'unique' | 'aliased' 690 | RValue, rv ::= 'use'(Operand) 691 | | 'unop'(Unop, Operand, Operand) 692 | | 'binop'(Binop, Operand, Operand) 693 | | 'ref'(Lifetime, BorrowKind, LValue) 694 | ``` 695 | Because we can only introduce references for lifetimes that are active, we will need to additionally route the lifetime context to all rvalue introducers. 696 | (This is done for the others in the appendix.)
697 | ``` 698 | RefUnique: 699 | ─────────────────────────────────── 700 | TC; 701 | LV, lv: Tᵢ; 702 | LV, lv: BorrowedMut<'a, Tₒ>; 703 | LC, 'a 704 | ⊢ ref('a, unique, lv): &mut<'a, Tᵢ, BorrowedMut<'a, Tₒ>> 705 | ``` 706 | As I said earlier, we change the type of the lvalue to the residual type, ahead of a value of that type actually being written to the reference. 707 | That is manifested as `lv: BorrowedMut<'a, Tₒ>` in the output lvalue context. 708 | One thing I didn't mention, and I didn't initially expect, is that the residual type would *also* be borrowed. 709 | The reason for this is that when reborrowing an indeterminate number of times (e.g. when descending through a tree), it is impossible to keep all intermediate references around, for that would take an indeterminate number of lvalues. 710 | As the deref rule will make far clearer, this allows intermediate references to be dropped. 711 | Note finally that because one can borrow an lvalue without taking a reference, this doesn't constrain consumers who don't reborrow the reference. 712 | 713 | And the last prerequisite: projections. Projections are rustc's umbrella concept for getting lvalues from lvalues. 714 | Examples include (primitive) array indexing, field access, and deref. 715 | ``` 716 | Projection, prj ::= 'deref' 717 | LValue, lv ::= 'ret_slot' | Static | Local | Param 718 | | Projection(LValue) 719 | ``` 720 | Deref is all we need to concern ourselves with for now. 721 | 722 | Ok, finally everything is in place to introduce the deref rule. 723 | This rule is at the heart of what references do, and perhaps takes the least expected form.
724 | Considering that projections extract lvalues from lvalues, one might think the deref rule would look like: 725 | ``` 726 | LV ⊢ lv: &mut<'a, Tᵢ, Tₒ> 727 | ────────────────────── 728 | LV ⊢ deref(lv) : Tᵢ 729 | ``` 730 | or similarly: 731 | ``` 732 | ────────────────────────────────────── 733 | LV, lv: &mut<'a, Tᵢ, Tₒ> ⊢ deref(lv) : Tᵢ 734 | ``` 735 | This, however, breaks because node introducers themselves (e.g. assign) "change" the lvalue context (i.e. nodes' successors often take a different lvalue context than the nodes themselves). 736 | Those changes need to be propagated back to the types of the references being dereferenced. 737 | Instead we will use this rule: 738 | ``` 739 | WithDeref: 740 | TC, T; S; 741 | { k: ¬(LVᵢ, deref(lv): Tᵢ; LCᵢ; BCᵢ) | i ≥ 1 } 742 | ⊢ k': ¬(LV₀, deref(lv): T₀; LC₀; BC₀) 743 | ───────────────────────────────────────────────────────────────────── 744 | TC, T; S; 745 | K, { k: ¬(LVᵢ, lv: &mut<'a, Tᵢ, Tᵣ>; LCᵢ; BCᵢ) | i ≥ 1 } 746 | ⊢ with_deref(k)': ¬(LV₀, lv: &mut<'a, T₀, Tᵣ>; LC₀; BC₀) 747 | ``` 748 | This convoluted rule allows one to build a node where it and its successors just see the dereferenced lvalue instead of the reference itself. 749 | Then, provided we can build real successors which see a reference instead, this node too can be built with the reference instead. 750 | The type of the deref lvalue becomes the current type, and the residual type can be anything but must not change. 751 | Let's put this into context with assign. 752 | Say we want to move out of a unique reference. 753 | The premised assign node expects `deref(lv): T`, and its single successor expects `deref(lv): Uninit<_>`. 754 | Then the concluded `with_deref(assign(..))` node expects `lv: &mut<'a, T, Whatever>`, and its successor expects `lv: &mut<'a, Uninit<_>, Whatever>`. 755 | Thus, we've moved out of a reference! 756 | 757 | Another crucial example is reborrowing.
758 | First, it's instructive to see how the current and residual types of the references are "threaded together". 759 | The new reference's (initial) current type is the old reference's (previous) current type, and the new reference's residual type becomes the old reference's (new) current type. 760 | Pictorially this looks like: 761 | ``` 762 | ('a: 'b) 763 | ┌───────────────────────────────────────────────────────────┐ 764 | │&mut<'b, T₀, BorrowedMut>│ 765 | └──────────┬────────────────────────────────────┬───────────┘ 766 | ↑ ↓ 767 | ┌──────────────┴──────────────────┐ ┌─────────────┴─────────────────────────────┐ 768 | │&mut<'a, T₀, BorrowedMut>├ → ┤&mut<'a, BorrowedMut, BorrowedMut<>│ 769 | └─────────────────────────────────┘ └───────────────────────────────────────────┘ 770 | ``` 771 | On top is the new reference created from the reborrow. 772 | On bottom is the old reference, before and after the reborrow. 773 | See now also what I crudely described before when introducing the unique reference dropping rule. 774 | If `T₁ = T₂`, then the old reference can be dropped before the new reference, allowing an indefinite chain of reborrowing with no more than 2 references alive at a time. 775 | 776 | Second, it is good to be aware of the increase in capabilities on reborrowed references between the status quo and this proposal. 777 | If the dereferenced lvalue is borrowed and assigned, the successor's reborrowed reference (not the newly created reference!) will look like: 778 | ``` 779 | &mut<'old, BorrowedMut<'new, T>, T> 780 | ``` 781 | whereas today, one would end up with the moral equivalent of 782 | ``` 783 | BorrowedMut<'new, &mut<'old, T, T>> 784 | ``` 785 | this is the difference between (on top, my plan) a reference where the *referenced location* is borrowed, and (on bottom, status quo) a borrowed location holding a reference. 786 | The latter can only be dropped, but the former can be moved around too. 
787 | I don't have a complete plan, but this opens the door to storing a reborrowed reference and the reference that reborrows it together in the same struct. 788 | This is related to the [first wishlist item](https://github.com/tomaka/vulkano/blob/master/TROUBLES.md) of Tomaka's Vulkano library. 789 | [I do have a plan for making Box and other owning pointers unique reference newtypes. 790 | I will present it a few sections later.] 791 | 792 | ### Other proposals and the surface language 793 | 794 | Finally, before I end this section, a few words on how this relates to other proposals. 795 | `&mut`, `&move`, and `&out` are but special cases of `&mut<_, _, _>`: 796 | ``` 797 | &'a mut T = &mut<'a, T, T> 798 | &'a move T = &mut<'a, T, Uninit<_>> 799 | &'a out T = &mut<'a, Uninit<_>, T> 800 | ``` 801 | The subtyping rule for `&mut<_, _, _>` likewise implies the subtyping rules for the three. 802 | Because unique references' current type parameter is covariant, `&move`'s parameter is also covariant. 803 | Because unique references' residual type parameter is invariant (or contravariant), `&out`'s parameter is invariant (or contravariant). 804 | Invariance overrides the others (or covariance and contravariance cancel out), so `&mut`'s parameter is invariant. 805 | Additionally, there has been some interest in relaxing the well-formedness restriction on `&mut`. 806 | While I do not know if this is backwards compatible, this proposal would indicate that it can be done soundly. 807 | I should finally note that the chief criticism of these proposals is the worry that by introducing more primitive reference types, we'll open the floodgates and end up with an overflowing menagerie of confusing and non-orthogonal primitive pointer types. 808 | Well, I am very pleased to report that with this proposal there are and will be no new pointer types, as `&mut T` becomes but a synonym for its generalization.
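For comparison, here is roughly how the `&out` and `&move` special cases have to be approximated in today's Rust, where the type system cannot record that a location changed state. This is a sketch under assumptions, not part of the proposal: `init_out`, `consume_move`, and `demo` are hypothetical names, `&out` is approximated with the standard library's `MaybeUninit` plus one `unsafe` assertion, and `&move` with an `Option` that is checked at run time.

```rust
use std::mem::MaybeUninit;

/// Rough analogue of taking `&'a out String`: the callee receives an
/// uninitialized location and is expected to leave it initialized.
fn init_out<'a>(slot: &'a mut MaybeUninit<String>, text: &str) -> &'a mut String {
    // `MaybeUninit::write` stores the value and hands back a reference to it.
    slot.write(text.to_owned())
}

/// Rough analogue of taking `&'a move String`: the callee moves the value
/// out and leaves the location logically uninitialized (`None` here).
fn consume_move(slot: &mut Option<String>) -> Option<usize> {
    slot.take().map(|s| s.len())
}

pub fn demo() -> (String, Option<usize>, bool) {
    let mut out_slot = MaybeUninit::<String>::uninit();
    init_out(&mut out_slot, "hello").push('!');
    // SAFETY: `init_out` wrote a value above. The compiler cannot check this,
    // which is the gap a residual type of `String` is meant to close.
    let initialized = unsafe { out_slot.assume_init() };

    let mut move_slot = Some(String::from("abc"));
    let len = consume_move(&mut move_slot);
    (initialized, len, move_slot.is_none())
}
```

With `&mut<'a, Uninit<_>, T>` and `&mut<'a, T, Uninit<_>>`, both the `unsafe` block and the run-time `Option` tag would presumably become unnecessary, since the caller's view of the location would follow from the callee's signature.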
809 | 810 | Thanks to default type parameters, we can backwards-compatibly extend `DerefMut` to support any unique reference: 811 | ```rust 812 | pub trait DerefMut: Deref { 813 | type ResidualTarget = Self::Target; 814 | fn deref_mut(self: &mut) -> &mut; 815 | } 816 | ``` 817 | This removes the need for any `DerefMove` trait or similar. 818 | 819 | 820 | ## Appendix: Grammar and Rules in Full 821 | 822 | All the extensions and modifications of each section, squashed together. 823 | 824 | ### Grammar 825 | 826 | ``` 827 | anything* ::= | anything ',' anything* 828 | ``` 829 | 830 | ``` 831 | Static, s ::= 'static0' | 'static1' | ... 832 | Local, l ::= 'local0' | 'local1' | ... 833 | Param, p ::= 'param0' | 'param1' | ... 834 | Projection, prj ::= 'deref' 835 | LValue, lv ::= 'ret_slot' | Static | Local | Param 836 | | Projection(LValue) 837 | ``` 838 | ``` 839 | LifetimeLocal, 'l ::= '\'local0' | '\'local1' | ... 840 | LifetimeParam, 'p ::= '\'param0' | '\'param1' | ... 841 | Lifetime, 'a ::= '\'static' | LifetimeLocal | LifetimeParam 842 | ``` 843 | ``` 844 | Size, n ::= '0b' | '1b' | '2b' | ... 845 | TypeBuiltin ::= 'Uninit' | ! 846 | | &'mut' | 'BorrowedMut' 847 | | ... 848 | TypeParam, TP ::= 'TParam0' | 'TParam1' | ... 849 | TypeUserDef ::= 'User0' | 'User1' | ... 850 | Type, T ::= TypeBuiltin | TypeParam | TypeUserDef 851 | ``` 852 | ``` 853 | Constant, c ::= 'BorrowedMut'::(Constant) | ... 854 | Operand, o ::= 'const'(Constant) 855 | | 'consume'(LValue) 856 | Unop, u ::= ... 857 | Binop, b ::= ...
858 | BorrowKind ::= 'unique' | 'aliased' 859 | RValue, rv ::= 'use'(Operand) 860 | | 'unop'(Unop, Operand, Operand) 861 | | 'binop'(Binop, Operand, Operand) 862 | | 'ref'(Lifetime, BorrowKind, LValue) 863 | ``` 864 | ``` 865 | LValueContext, LV ::= (LValue: Type)* 866 | StaticContext, S ::= (Static: Type)* 867 | ``` 868 | ``` 869 | Trait, tr ::= # some globally-unique identifier 870 | TraitBound, trb ::= Type ':' Trait 871 | TypePremise ::= TraitBound | TypeParam | TypeUserDef 872 | TypeContext, TC ::= TypePremise* 873 | ``` 874 | ``` 875 | LifetimeBound ::= Lifetime ':' Lifetime 876 | | Type ':' Lifetime 877 | BoundContext, BC ::= LifetimeBound* 878 | ``` 879 | ``` 880 | Label, k ::= 'enter' | 'exit' 881 | | 'k0' | 'k1' | ... 882 | Node, e ::= 'Assign'(LValue, Operand, Label) 883 | | 'DropCopy'(LValue, Label) 884 | | 'If'(Operand, Label, Label) 885 | | ... # et cetera 886 | LifetimeContext, LC ::= Lifetime* 887 | NodeType, ¬T ::= ¬(LValueContext; LifetimeContext; BoundContext) 888 | CfgContext, K ::= (Label : NodeType)* 889 | ``` 890 | 891 | ### Rules 892 | 893 | #### Operand Introduction Rules 894 | ``` 895 | Const: 896 | TC ⊢ c: T 897 | ──────────────────────────── 898 | TC; LV; LV ⊢ const(c): T 899 | ``` 900 | ``` 901 | MoveConsume: 902 | ────────────────────────────────────────────────────────────────────── 903 | TC, T: Move + !Copy; LV, lv: T; LV, lv: Uninit<_> ⊢ consume(lv): T 904 | ``` 905 | ``` 906 | CopyConsume: 907 | ────────────────────────────────────────────────────── 908 | TC, T: Copy; LV, lv: T; LV, lv: T ⊢ consume(lv): T 909 | ``` 910 | 911 | #### RValue Introduction Rules 912 | ``` 913 | Use: 914 | TC; LV₀; LV₁ ⊢ o: T 915 | ────────────────────────── 916 | TC; LV₀; LV₁; LC ⊢ use(o): T 917 | ``` 918 | ``` 919 | UnOp: 920 | TC; LV₀; LV₁ ⊢ o: T 921 | u: fn(T) -> Tᵣ # primops need no context 922 | ────────────────────────────── 923 | TC; LV₀; LV₁; LC ⊢ use(u, o): Tᵣ 924 | ``` 925 | ``` 926 | BinOp: 927 | TC; LV₀; LV₁ ⊢ o₀: T₀ 928 | TC; LV₁; LV₂ ⊢ o₁: 
T₁ 929 | b: fn(T₀, T₁) -> Tᵣ # primops need no context 930 | ─────────────────────────────────── 931 | TC; LV₀; LV₂; LC ⊢ use(b, o₀, o₁): Tᵣ 932 | ``` 933 | ``` 934 | RefUnique: 935 | ─────────────────────────────────── 936 | TC; 937 | LV, lv: Tᵢ; 938 | LV, lv: BorrowedMut<'a, Tₒ>; 939 | LC, 'a 940 | ⊢ ref('a, unique, lv): &mut<'a, Tᵢ, BorrowedMut<'a, Tₒ>> 941 | ``` 942 | 943 | #### Node/Continuation Introduction Rules 944 | ``` 945 | Assign: 946 | TC; S, LV, lv: Uninit<_>; S, LV, lv: Uninit<_>; LC ⊢ rv: T 947 | TC; S; K ⊢ k: ¬(LV, lv: T; LC; BC) 948 | ───────────────────────────────────────────────────────────────── 949 | TC; S; K ⊢ assign(lv, rv, k): ¬(LV, lv: Uninit<_>; LC; BC) 950 | ``` 951 | ``` 952 | Call: 953 | TC ⊢ trb* 954 | OB ⊢ lb* 955 | T₀ = for fn(T₁..Tₙ) -> Tᵣ where trb*, lb* 956 | ∀i. 957 | TC; S, LVᵢ; S, LVᵢ₊₁ ⊢ oᵢ : Tᵢ [Tₜₐ/TP]* ['a/'p]* 958 | TC; S; K ⊢ k: ¬(LVₙ₊₁, lv: Tᵣ; LC; OB) 959 | ─────────────────────────────────────────────────────────────────────── 960 | TC; S; K ⊢ call(lv, o*, k): ¬(LV₀, lv: Uninit<_>; LC; OB) 961 | ``` 962 | ``` 963 | Unreachable: 964 | ────────────────────────────────────────────── 965 | TC; S; K ⊢ unreachable: ¬(LV₀, lv: !; LC; BC) 966 | ``` 967 | ``` 968 | CopyDrop: 969 | TC, T: Copy; S; K ⊢ k: ¬(LV, lv: Uninit<_>; LC; BC) 970 | ──────────────────────────────────────────────────────── 971 | TC, T: Copy; S; K ⊢ drop(lv, k): ¬(LV, lv: T; LC; BC) 972 | ``` 973 | ``` 974 | If: 975 | TC; S, LV₀; S, LV₁ ⊢ o: T 976 | TC; S; K ⊢ k₀: ¬(LV₁; LC; BC) 977 | TC; S; K ⊢ k₁: ¬(LV₁; LC; BC) 978 | ────────────────────────────────────────── 979 | TC; S; K ⊢ if(o, k₀, k₁): ¬(LV₀; LC; BC) 980 | ``` 981 | ``` 982 | Switch: 983 | (∪ₙ Tₙ) :> T 984 | ∀i 985 | TC; S; K ⊢ kᵢ: ¬(LV, lv: Tᵢ; LC; BC) 986 | ──────────────────────────────────────────────────── 987 | TC; S; K ⊢ switch(lv, t, k*): ¬(LV, lv: T; LC; BC) 988 | ``` 989 | ``` 990 | LifetimeBegin: 991 | TC; S; K ⊢ k: ¬(LV; LC, 'l; BC) 992 | ──────────────────────────────────────── 993 | TC;
S; K ⊢ begin('l, k): ¬(LV; LC; BC) 994 | ``` 995 | ``` 996 | LifetimeEnd: 997 | LV₁ = LV₀[T / BorrowedMut<'l, T>] # Unborrow for this lifetime 998 | 'l ∉ LV₁ 999 | TC; S; K ⊢ k: ¬(LV₁; LC; BC) 1000 | ──────────────────────────────────────────────────────────────── 1001 | TC; S; K ⊢ end('l, k): ¬(LV₀; LC, 'l; BC, {'l: 'a | 'a ∈ LC }) 1002 | ``` 1003 | ``` 1004 | DropUniqRef: 1005 | TC, T; S; K ⊢ k: ¬(LV, lv: Uninit<_>; LC; BC) 1006 | ─────────────────────────────────────────────────────────────── 1007 | TC, T; S; K ⊢ drop(lv, k): ¬(LV, lv: &mut<'a, T, T>; LC; BC) 1008 | ``` 1009 | ``` 1010 | WithDeref: 1011 | TC, T; S; 1012 | { k: ¬(LVᵢ, deref(lv): Tᵢ; LCᵢ; BCᵢ) | i ≥ 1 } 1013 | ⊢ k': ¬(LV₀, deref(lv): T₀; LC₀; BC₀) 1014 | ───────────────────────────────────────────────────────────────────── 1015 | TC, T; S; 1016 | K, { k: ¬(LVᵢ, lv: &mut<'a, Tᵢ, Tᵣ>; LCᵢ; BCᵢ) | i ≥ 1 } 1017 | ⊢ with_deref(k)': ¬(LV₀, lv: &mut<'a, T₀, Tᵣ>; LC₀; BC₀) 1018 | ``` 1019 | 1020 | #### `Fn` Introduction Rules 1021 | ``` 1022 | Fn: 1023 | k₀ = entry 1024 | T₀ = ¬((s: Tₛ)*, (a: Tₚ)*, (l: Uninit<_>)*, ret_slot: Uninit<_>; 'p*, BC) 1025 | ∀i. 1026 | TC; # trait impls 1027 | S; # statics 1028 | l*; # locals (just the location names, no types) 1029 | K, # user labels, K = { kₙ: ¬Tₙ | n } 1030 | exit: ¬((s: Tₛ)*, (a: Uninit<_>)*, (l: Uninit<_>)*, ret_slot: Tᵣ; 'p*, BC); 1031 | ⊢ eᵢ: ¬Tᵢ 1032 | ──────────────────────────────────────────────────────────────────────────────── 1033 | TC; S ⊢ Mir { params, locals, labels: { (k: ¬T = e)* }, .. 
}: 1034 | for<'p*> fn(Tₚ*) -> Tᵣ where BC 1035 | ``` 1036 | ``` 1037 | FnGeneric: 1038 | TC, TP*, trb*; S ⊢ f: for<'p*> fn(Tₚ*) -> Tᵣ where BC 1039 | ──────────────────────────────────────────────────────────────────── 1040 | TC; S ⊢ (Λ f): for fn(Tₚ*) -> Tᵣ where trb*, BC 1041 | ``` 1042 | 1043 | #### Subtyping 1044 | ``` 1045 | SubRefl: 1046 | ────── 1047 | T <: T 1048 | ``` 1049 | ``` 1050 | SubTrans: 1051 | T₀ <: T₁ 1052 | T₁ <: T₂ 1053 | ────── 1054 | T₀ <: T₂ 1055 | ``` 1056 | ``` 1057 | SubContLValue: 1058 | b <: a 1059 | ──────────────────────────────────────────── 1060 | ¬(LV, lv: a; LC; OB) <: ¬(LV, lv: b; LC; OB) 1061 | ``` 1062 | ``` 1063 | SubContOblig: 1064 | ∀lb ∈ OB₁. OB₀; <> ⊢ lb 1065 | ──────────────────────────────── 1066 | ¬(LV; LC; OB₁) <: ¬(LV; LC; OB₀) 1067 | ``` 1068 | ``` 1069 | SubUniqRef: 1070 | 'a₀ <: 'a₁ 1071 | Tᵢ₀ <: Tᵢ₁ 1072 | Tₒ₀ :> Tₒ₁ # optional contravariance 1073 | ────────────────────────────────────────── 1074 | &mut<'a₀, Tᵢ₀, tₒ₀> <: &mut<'a₁, Tᵢ₁, tₒ₁> 1075 | ``` 1076 | ``` 1077 | SubBorrowedMut: 1078 | 'a₀ :> 'a₁ # optional contravariance 1079 | T₀ <: T₁ 1080 | ────────────────────────────────────── 1081 | BorrowedMut<'a₀, T₀> <: BorrowedMut<'a₁, T₁> 1082 | ``` 1083 | ``` 1084 | SubBorrowUniquely: 1085 | ────────────────────────────────────── 1086 | T <: BorrowedMut<'a, T> 1087 | ``` 1088 | 1089 | #### Outlives 1090 | Changes from [RFC 1214](https://github.com/rust-lang/rfcs/blob/master/text/1214-projections-lifetimes-and-wf.md). 1091 | ``` 1092 | OutlivesUniqRef: 1093 | R ⊢ 'a₀: 'a₁ 1094 | R ⊢ Tᵢ: 'a₁ 1095 | R ⊢ Tₒ: 'a₁ # only if invariant 1096 | ────────────────────────── 1097 | R ⊢ &mut<'a₀, Tᵢ, Tₒ>: 'a₁ 1098 | ``` 1099 | ``` 1100 | OutlivesBorrowedMut: 1101 | R ⊢ T: 'a₁ 1102 | ───────────────────────── 1103 | R ⊢ BorrowedMut<'a₀, T>: 'a₁ 1104 | ``` 1105 | 1106 | #### Well-Formedness 1107 | Changes from [RFC 1214](https://github.com/rust-lang/rfcs/blob/master/text/1214-projections-lifetimes-and-wf.md). 
1108 | ``` 1109 | WfUniqRef: 1110 | R ⊢ Tᵢ WF 1111 | R ⊢ Tₒ WF 1112 | ─────────────────────── 1113 | R ⊢ &mut<'a, Tᵢ, Tₒ> WF 1114 | ``` 1115 | ``` 1116 | WfBorrowedMut: 1117 | R ⊢ T WF 1118 | R ⊢ T: 'a 1119 | ───────────────────────── 1120 | R ⊢ BorrowedMut<'a, T> WF 1121 | ``` 1122 | --------------------------------------------------------------------------------