├── .gitignore ├── src ├── arc-mutex │ ├── arc-and-mutex.md │ ├── arc.md │ ├── arc-final.md │ ├── arc-layout.md │ ├── arc-clone.md │ ├── arc-drop.md │ └── arc-base.md ├── uninitialized.md ├── vec │ ├── vec.md │ ├── vec-dealloc.md │ ├── vec-deref.md │ ├── vec-insert-remove.md │ ├── vec-layout.md │ ├── vec-push-pop.md │ ├── vec-raw.md │ ├── vec-drain.md │ ├── vec-into-iter.md │ ├── vec-zsts.md │ └── vec-alloc.md ├── data.md ├── concurrency.md ├── references.md ├── obrm.md ├── conversions.md ├── coercions.md ├── casts.md ├── unbounded-lifetimes.md ├── poisoning.md ├── panic-handler.md ├── constructors.md ├── SUMMARY.md ├── lifetime-elision.md ├── ownership.md ├── hrtb.md ├── unwinding.md ├── transmutes.md ├── beneath-std.md ├── intro.md ├── checked-uninit.md ├── drop-flags.md ├── meet-safe-and-unsafe.md ├── lifetime-mismatch.md ├── races.md ├── working-with-unsafe.md ├── what-unsafe-does.md ├── repr-rust.md ├── dot-operator.md ├── aliasing.md ├── destructors.md ├── unchecked-uninit.md ├── exotic-sizes.md ├── exception-safety.md ├── other-reprs.md ├── borrow-splitting.md └── safe-unsafe-meaning.md ├── CITATION.cff ├── theme └── nomicon.css ├── LICENSE-MIT ├── book.toml ├── .github └── workflows │ └── main.yml └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | book 3 | 4 | # linkcheck stuff 5 | linkcheck 6 | linkchecker 7 | linkcheck.sh 8 | -------------------------------------------------------------------------------- /src/arc-mutex/arc-and-mutex.md: -------------------------------------------------------------------------------- 1 | # Implementing Arc and Mutex 2 | 3 | Knowing the theory is all fine and good, but the *best* way to understand 4 | something is to use it. To better understand atomics and interior mutability, 5 | we'll be implementing versions of the standard library's `Arc` and `Mutex` types. 6 | 7 | TODO: Write `Mutex` chapters. 8 | -------------------------------------------------------------------------------- /CITATION.cff: -------------------------------------------------------------------------------- 1 | cff-version: 1.2.0 2 | message: If you use this book, please cite it using these metadata. 3 | title: The Rustonomicon 4 | abstract: The Dark Arts of Advanced and Unsafe Rust Programming 5 | authors: 6 | - name: "The Rust Project Developers" 7 | date-released: "2017-03-03" 8 | license: "MIT OR Apache-2.0" 9 | repository-code: "https://github.com/rust-lang/nomicon" 10 | -------------------------------------------------------------------------------- /src/uninitialized.md: -------------------------------------------------------------------------------- 1 | # Working With Uninitialized Memory 2 | 3 | All runtime-allocated memory in a Rust program begins its life as 4 | *uninitialized*. In this state the value of the memory is an indeterminate pile 5 | of bits that may or may not even reflect a valid state for the type that is 6 | supposed to inhabit that location of memory. Attempting to interpret this memory 7 | as a value of *any* type will cause Undefined Behavior. Do Not Do This. 8 | 9 | Rust provides mechanisms to work with uninitialized memory in checked (safe) and 10 | unchecked (unsafe) ways. 11 | -------------------------------------------------------------------------------- /src/arc-mutex/arc.md: -------------------------------------------------------------------------------- 1 | # Implementing Arc 2 | 3 | In this section, we'll be implementing a simpler version of `std::sync::Arc`. 
4 | Similarly to [the implementation of `Vec` we made earlier](../vec/vec.md), we won't be 5 | taking advantage of as many optimizations, intrinsics, or unstable code as the 6 | standard library may. 7 | 8 | This implementation is loosely based on the standard library's implementation 9 | (technically taken from `alloc::sync` in 1.49, as that's where it's actually 10 | implemented), but it will not support weak references at the moment as they 11 | make the implementation slightly more complex. 12 | 13 | Please note that this section is very work-in-progress at the moment. 14 | -------------------------------------------------------------------------------- /src/vec/vec.md: -------------------------------------------------------------------------------- 1 | # Example: Implementing Vec 2 | 3 | To bring everything together, we're going to write `std::Vec` from scratch. 4 | We will limit ourselves to stable Rust. In particular we won't use any 5 | intrinsics that could make our code a little bit nicer or efficient because 6 | intrinsics are permanently unstable. Although many intrinsics *do* become 7 | stabilized elsewhere (`std::ptr` and `std::mem` consist of many intrinsics). 8 | 9 | Ultimately this means our implementation may not take advantage of all 10 | possible optimizations, though it will be by no means *naive*. We will 11 | definitely get into the weeds over nitty-gritty details, even 12 | when the problem doesn't *really* merit it. 13 | 14 | You wanted advanced. We're gonna go advanced. 15 | -------------------------------------------------------------------------------- /src/data.md: -------------------------------------------------------------------------------- 1 | # Data Representation in Rust 2 | 3 | Low-level programming cares a lot about data layout. It's a big deal. It also 4 | pervasively influences the rest of the language, so we're going to start by 5 | digging into how data is represented in Rust. 6 | 7 | This chapter is ideally in agreement with, and rendered redundant by, 8 | the [Type Layout section of the Reference][ref-type-layout]. When this 9 | book was first written, the reference was in complete disrepair, and the 10 | Rustonomicon was attempting to serve as a partial replacement for the reference. 11 | This is no longer the case, so this whole chapter can ideally be deleted. 12 | 13 | We'll keep this chapter around for a bit longer, but ideally you should be 14 | contributing any new facts or improvements to the Reference instead. 15 | 16 | [ref-type-layout]: ../reference/type-layout.html 17 | -------------------------------------------------------------------------------- /src/concurrency.md: -------------------------------------------------------------------------------- 1 | # Concurrency and Parallelism 2 | 3 | Rust as a language doesn't *really* have an opinion on how to do concurrency or 4 | parallelism. The standard library exposes OS threads and blocking sys-calls 5 | because everyone has those, and they're uniform enough that you can provide 6 | an abstraction over them in a relatively uncontroversial way. Message passing, 7 | green threads, and async APIs are all diverse enough that any abstraction over 8 | them tends to involve trade-offs that we weren't willing to commit to for 1.0. 9 | 10 | However the way Rust models concurrency makes it relatively easy to design your own 11 | concurrency paradigm as a library and have everyone else's code Just Work 12 | with yours. 
Just require the right lifetimes and Send and Sync where appropriate 13 | and you're off to the races. Or rather, off to the... not... having... races. 14 | -------------------------------------------------------------------------------- /theme/nomicon.css: -------------------------------------------------------------------------------- 1 | /* 2 | Taken from the reference. 3 | Warnings and notes: 4 | Write the
<div>s on their own line. E.g. 5 | <div class="warning">
6 | Warning: This is bad! 7 | </div>
8 | */ 9 | main .warning p { 10 | padding: 10px 20px; 11 | margin: 20px 0; 12 | } 13 | 14 | main .warning p::before { 15 | content: "⚠️ "; 16 | } 17 | 18 | .light main .warning p, 19 | .rust main .warning p { 20 | border: 2px solid red; 21 | background: #ffcece; 22 | } 23 | 24 | .rust main .warning p { 25 | /* overrides previous declaration */ 26 | border-color: #961717; 27 | } 28 | 29 | .coal main .warning p, 30 | .navy main .warning p, 31 | .ayu main .warning p { 32 | background: #542626 33 | } 34 | 35 | /* Make the links higher contrast on dark themes */ 36 | .coal main .warning p a, 37 | .navy main .warning p a, 38 | .ayu main .warning p a { 39 | color: #80d0d0 40 | } 41 | -------------------------------------------------------------------------------- /src/references.md: -------------------------------------------------------------------------------- 1 | # References 2 | 3 | There are two kinds of references: 4 | 5 | * Shared reference: `&` 6 | * Mutable reference: `&mut` 7 | 8 | Which obey the following rules: 9 | 10 | * A reference cannot outlive its referent 11 | * A mutable reference cannot be aliased 12 | 13 | That's it. That's the whole model references follow. 14 | 15 | Of course, we should probably define what *aliased* means. 16 | 17 | ```text 18 | error[E0425]: cannot find value `aliased` in this scope 19 | --> :2:20 20 | | 21 | 2 | println!("{}", aliased); 22 | | ^^^^^^^ not found in this scope 23 | 24 | error: aborting due to previous error 25 | ``` 26 | 27 | Unfortunately, Rust hasn't actually defined its aliasing model. 🙀 28 | 29 | While we wait for the Rust devs to specify the semantics of their language, 30 | let's use the next section to discuss what aliasing is in general, and why it 31 | matters. 32 | -------------------------------------------------------------------------------- /src/obrm.md: -------------------------------------------------------------------------------- 1 | # The Perils Of Ownership Based Resource Management (OBRM) 2 | 3 | OBRM (AKA RAII: Resource Acquisition Is Initialization) is something you'll 4 | interact with a lot in Rust. Especially if you use the standard library. 5 | 6 | Roughly speaking the pattern is as follows: to acquire a resource, you create an 7 | object that manages it. To release the resource, you simply destroy the object, 8 | and it cleans up the resource for you. The most common "resource" this pattern 9 | manages is simply *memory*. `Box`, `Rc`, and basically everything in 10 | `std::collections` is a convenience to enable correctly managing memory. This is 11 | particularly important in Rust because we have no pervasive GC to rely on for 12 | memory management. Which is the point, really: Rust is about control. However we 13 | are not limited to just memory. Pretty much every other system resource like a 14 | thread, file, or socket is exposed through this kind of API. 15 | -------------------------------------------------------------------------------- /src/vec/vec-dealloc.md: -------------------------------------------------------------------------------- 1 | # Deallocating 2 | 3 | Next we should implement Drop so that we don't massively leak tons of resources. 4 | The easiest way is to just call `pop` until it yields None, and then deallocate 5 | our buffer. Note that calling `pop` is unneeded if `T: !Drop`. In theory we can 6 | ask Rust if `T` `needs_drop` and omit the calls to `pop`. 
However in practice 7 | LLVM is *really* good at removing simple side-effect free code like this, so I 8 | wouldn't bother unless you notice it's not being stripped (in this case it is). 9 | 10 | We must not call `alloc::dealloc` when `self.cap == 0`, as in this case we 11 | haven't actually allocated any memory. 12 | 13 | 14 | ```rust,ignore 15 | impl<T> Drop for Vec<T> { 16 | fn drop(&mut self) { 17 | if self.cap != 0 { 18 | while let Some(_) = self.pop() { } 19 | let layout = Layout::array::<T>(self.cap).unwrap(); 20 | unsafe { 21 | alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout); 22 | } 23 | } 24 | } 25 | } 26 | ``` 27 | -------------------------------------------------------------------------------- /src/conversions.md: -------------------------------------------------------------------------------- 1 | # Type Conversions 2 | 3 | At the end of the day, everything is just a pile of bits somewhere, and type 4 | systems are just there to help us use those bits right. There are two common 5 | problems with typing bits: needing to reinterpret those exact bits as a 6 | different type, and needing to change the bits to have equivalent meaning for 7 | a different type. Because Rust encourages encoding important properties in the 8 | type system, these problems are incredibly pervasive. As such, Rust 9 | consequently gives you several ways to solve them. 10 | 11 | First we'll look at the ways that Safe Rust gives you to reinterpret values. 12 | The most trivial way to do this is to just destructure a value into its 13 | constituent parts and then build a new type out of them. e.g. 14 | 15 | ```rust 16 | struct Foo { 17 | x: u32, 18 | y: u16, 19 | } 20 | 21 | struct Bar { 22 | a: u32, 23 | b: u16, 24 | } 25 | 26 | fn reinterpret(foo: Foo) -> Bar { 27 | let Foo { x, y } = foo; 28 | Bar { a: x, b: y } 29 | } 30 | ``` 31 | 32 | But this is, at best, annoying. For common conversions, Rust provides 33 | more ergonomic alternatives. 34 | -------------------------------------------------------------------------------- /LICENSE-MIT: -------------------------------------------------------------------------------- 1 | Copyright (c) 2010 The Rust Project Developers 2 | 3 | Permission is hereby granted, free of charge, to any 4 | person obtaining a copy of this software and associated 5 | documentation files (the "Software"), to deal in the 6 | Software without restriction, including without 7 | limitation the rights to use, copy, modify, merge, 8 | publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software 10 | is furnished to do so, subject to the following 11 | conditions: 12 | 13 | The above copyright notice and this permission notice 14 | shall be included in all copies or substantial portions 15 | of the Software. 16 | 17 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF 18 | ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED 19 | TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A 20 | PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 21 | SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 22 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 23 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR 24 | IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 25 | DEALINGS IN THE SOFTWARE.
26 | -------------------------------------------------------------------------------- /book.toml: -------------------------------------------------------------------------------- 1 | [book] 2 | authors = ["The Rust Project Developers"] 3 | title = "The Rustonomicon" 4 | description = "The Dark Arts of Advanced and Unsafe Rust Programming" 5 | 6 | [output.html] 7 | additional-css = ["theme/nomicon.css"] 8 | git-repository-url = "https://github.com/rust-lang/nomicon" 9 | 10 | [output.html.redirect] 11 | # Vec-related chapters. 12 | "./vec-alloc.html" = "./vec/vec-alloc.html" 13 | "./vec-dealloc.html" = "./vec/vec-dealloc.html" 14 | "./vec-deref.html" = "./vec/vec-deref.html" 15 | "./vec-drain.html" = "./vec/vec-drain.html" 16 | "./vec-final.html" = "./vec/vec-final.html" 17 | "./vec-insert-remove.html" = "./vec/vec-insert-remove.html" 18 | "./vec-into-iter.html" = "./vec/vec-into-iter.html" 19 | "./vec-layout.html" = "./vec/vec-layout.html" 20 | "./vec-push-pop.html" = "./vec/vec-push-pop.html" 21 | "./vec-raw.html" = "./vec/vec-raw.html" 22 | "./vec-zsts.html" = "./vec/vec-zsts.html" 23 | "./vec.html" = "./vec/vec.html" 24 | 25 | # Arc and Mutex related chapters. 26 | "./arc-and-mutex.html" = "./arc-mutex/arc-and-mutex.html" 27 | "./arc-base.html" = "./arc-mutex/arc-base.html" 28 | "./arc-clone.html" = "./arc-mutex/arc-clone.html" 29 | "./arc-drop.html" = "./arc-mutex/arc-drop.html" 30 | "./arc-final.html" = "./arc-mutex/arc-final.html" 31 | "./arc-layout.html" = "./arc-mutex/arc-layout.html" 32 | "./arc.html" = "./arc-mutex/arc.html" 33 | 34 | [rust] 35 | edition = "2024" 36 | -------------------------------------------------------------------------------- /src/vec/vec-deref.md: -------------------------------------------------------------------------------- 1 | # Deref 2 | 3 | Alright! We've got a decent minimal stack implemented. We can push, we can 4 | pop, and we can clean up after ourselves. However there's a whole mess of 5 | functionality we'd reasonably want. In particular, we have a proper array, but 6 | none of the slice functionality. That's actually pretty easy to solve: we can 7 | implement `Deref`. This will magically make our Vec coerce to, and 8 | behave like, a slice in all sorts of conditions. 9 | 10 | All we need is `slice::from_raw_parts`. It will correctly handle empty slices 11 | for us. Later once we set up zero-sized type support it will also Just Work 12 | for those too. 13 | 14 | 15 | ```rust,ignore 16 | use std::ops::Deref; 17 | 18 | impl<T> Deref for Vec<T> { 19 | type Target = [T]; 20 | fn deref(&self) -> &[T] { 21 | unsafe { 22 | std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) 23 | } 24 | } 25 | } 26 | ``` 27 | 28 | And let's do DerefMut too: 29 | 30 | 31 | ```rust,ignore 32 | use std::ops::DerefMut; 33 | 34 | impl<T> DerefMut for Vec<T> { 35 | fn deref_mut(&mut self) -> &mut [T] { 36 | unsafe { 37 | std::slice::from_raw_parts_mut(self.ptr.as_ptr(), self.len) 38 | } 39 | } 40 | } 41 | ``` 42 | 43 | Now we have `len`, `first`, `last`, indexing, slicing, sorting, `iter`, 44 | `iter_mut`, and all other sorts of bells and whistles provided by slice. Sweet! 45 | -------------------------------------------------------------------------------- /src/coercions.md: -------------------------------------------------------------------------------- 1 | # Coercions 2 | 3 | Types can implicitly be coerced to change in certain contexts. 4 | These changes are generally just *weakening* of types, largely focused around pointers and lifetimes.
5 | They mostly exist to make Rust "just work" in more cases, and are largely harmless. 6 | 7 | For an exhaustive list of all the types of coercions, see the [Coercion types] section on the reference. 8 | 9 | Note that we do not perform coercions when matching traits (except for receivers, see the [next page][dot-operator]). 10 | If there is an `impl` for some type `U` and `T` coerces to `U`, that does not constitute an implementation for `T`. 11 | For example, the following will not type check, even though it is OK to coerce `t` to `&T` and there is an `impl` for `&T`: 12 | 13 | ```rust,compile_fail 14 | trait Trait {} 15 | 16 | fn foo<X: Trait>(t: X) {} 17 | 18 | impl<'a> Trait for &'a i32 {} 19 | 20 | fn main() { 21 | let t: &mut i32 = &mut 0; 22 | foo(t); 23 | } 24 | ``` 25 | 26 | which fails as follows: 27 | 28 | ```text 29 | error[E0277]: the trait bound `&mut i32: Trait` is not satisfied 30 | --> src/main.rs:9:9 31 | | 32 | 3 | fn foo<X: Trait>(t: X) {} 33 | | ----- required by this bound in `foo` 34 | ... 35 | 9 | foo(t); 36 | | ^ the trait `Trait` is not implemented for `&mut i32` 37 | | 38 | = help: the following implementations were found: 39 | <&'a i32 as Trait> 40 | = note: `Trait` is implemented for `&i32`, but not for `&mut i32` 41 | ``` 42 | 43 | [Coercion types]: ../reference/type-coercions.html#coercion-types 44 | [dot-operator]: ./dot-operator.html 45 | -------------------------------------------------------------------------------- /src/casts.md: -------------------------------------------------------------------------------- 1 | # Casts 2 | 3 | Casts are a superset of coercions: every coercion can be explicitly invoked via a cast. 4 | However some conversions require a cast. 5 | While coercions are pervasive and largely harmless, these "true casts" are rare and potentially dangerous. 6 | As such, casts must be explicitly invoked using the `as` keyword: `expr as Type`. 7 | 8 | You can find an exhaustive list of [all the true casts][cast list] and [casting semantics][semantics list] on the reference. 9 | 10 | ## Safety of casting 11 | 12 | True casts generally revolve around raw pointers and the primitive numeric types. 13 | Even though they're dangerous, these casts are infallible at runtime. 14 | If a cast triggers some subtle corner case no indication will be given that this occurred. 15 | The cast will simply succeed. 16 | That said, casts must be valid at the type level, or else they will be prevented statically. 17 | For instance, `7u8 as bool` will not compile. 18 | 19 | That said, casts aren't `unsafe` because they generally can't violate memory safety *on their own*. 20 | For instance, converting an integer to a raw pointer can very easily lead to terrible things. 21 | However the act of creating the pointer itself is safe, because actually using a raw pointer is already marked as `unsafe`. 22 | 23 | ## Some notes about casting 24 | 25 | ### Lengths when casting raw slices 26 | 27 | Note that lengths are not adjusted when casting raw slices; `*const [u16] as *const [u8]` creates a slice that only includes half of the original memory. 28 | 29 | ### Transitivity 30 | 31 | Casting is not transitive, that is, even if `e as U1 as U2` is a valid expression, `e as U2` is not necessarily so.
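For instance (a small, hypothetical illustration of this rule): a reference can be cast to a raw pointer and that raw pointer can then be cast to an address, but the reference cannot be cast to an address directly:

```rust
fn main() {
    let x = 7;
    // `e as U1 as U2`: reference -> raw pointer -> address. This compiles.
    let addr = &x as *const i32 as usize;
    // `e as U2` directly: `&i32 as usize` is not a valid cast, so this line
    // would be rejected at compile time if uncommented.
    // let addr = &x as usize;
    println!("{}", addr);
}
```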
32 | 33 | [cast list]: ../reference/expressions/operator-expr.html#type-cast-expressions 34 | [semantics list]: ../reference/expressions/operator-expr.html#semantics 35 | -------------------------------------------------------------------------------- /.github/workflows/main.yml: -------------------------------------------------------------------------------- 1 | name: CI 2 | on: 3 | pull_request: 4 | merge_group: 5 | 6 | env: 7 | MDBOOK_VERSION: 0.5.1 8 | 9 | jobs: 10 | test: 11 | name: Test 12 | runs-on: ubuntu-latest 13 | steps: 14 | - uses: actions/checkout@v4 15 | - name: Update rustup 16 | run: rustup self update 17 | - name: Install Rust 18 | run: | 19 | rustup set profile minimal 20 | rustup toolchain install nightly -c rust-docs 21 | rustup default nightly 22 | - name: Install mdbook 23 | run: | 24 | mkdir bin 25 | curl -sSL https://github.com/rust-lang/mdBook/releases/download/v${MDBOOK_VERSION}/mdbook-v${MDBOOK_VERSION}-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=bin 26 | echo "$(pwd)/bin" >> $GITHUB_PATH 27 | - name: Report versions 28 | run: | 29 | rustup --version 30 | rustc -Vv 31 | mdbook --version 32 | - name: Run tests 33 | run: mdbook test 34 | - name: Check for broken links 35 | run: | 36 | curl -sSLo linkcheck.sh \ 37 | https://raw.githubusercontent.com/rust-lang/rust/master/src/tools/linkchecker/linkcheck.sh 38 | sh linkcheck.sh --all nomicon 39 | 40 | # The success job is here to consolidate the total success/failure state of 41 | # all other jobs. This job is then included in the GitHub branch protection 42 | # rule which prevents merges unless all other jobs are passing. This makes 43 | # it easier to manage the list of jobs via this yml file and to prevent 44 | # accidentally adding new jobs without also updating the branch protections. 45 | success: 46 | name: Success gate 47 | if: always() 48 | needs: 49 | - test 50 | runs-on: ubuntu-latest 51 | steps: 52 | - run: jq --exit-status 'all(.result == "success")' <<< '${{ toJson(needs) }}' 53 | - name: Done 54 | run: exit 0 55 | -------------------------------------------------------------------------------- /src/vec/vec-insert-remove.md: -------------------------------------------------------------------------------- 1 | # Insert and Remove 2 | 3 | Something *not* provided by slice is `insert` and `remove`, so let's do those 4 | next. 5 | 6 | Insert needs to shift all the elements at the target index to the right by one. 7 | To do this we need to use `ptr::copy`, which is our version of C's `memmove`. 8 | This copies some chunk of memory from one location to another, correctly 9 | handling the case where the source and destination overlap (which will 10 | definitely happen here). 11 | 12 | If we insert at index `i`, we want to shift the `[i .. len]` to `[i+1 .. len+1]` 13 | using the old len. 14 | 15 | 16 | ```rust,ignore 17 | pub fn insert(&mut self, index: usize, elem: T) { 18 | // Note: `<=` because it's valid to insert after everything 19 | // which would be equivalent to push. 20 | assert!(index <= self.len, "index out of bounds"); 21 | if self.len == self.cap { self.grow(); } 22 | 23 | unsafe { 24 | // ptr::copy(src, dest, len): "copy from src to dest len elems" 25 | ptr::copy( 26 | self.ptr.as_ptr().add(index), 27 | self.ptr.as_ptr().add(index + 1), 28 | self.len - index, 29 | ); 30 | ptr::write(self.ptr.as_ptr().add(index), elem); 31 | } 32 | 33 | self.len += 1; 34 | } 35 | ``` 36 | 37 | Remove behaves in the opposite manner. We need to shift all the elements from 38 | `[i+1 .. len + 1]` to `[i .. 
len]` using the *new* len. 39 | 40 | 41 | ```rust,ignore 42 | pub fn remove(&mut self, index: usize) -> T { 43 | // Note: `<` because it's *not* valid to remove after everything 44 | assert!(index < self.len, "index out of bounds"); 45 | unsafe { 46 | self.len -= 1; 47 | let result = ptr::read(self.ptr.as_ptr().add(index)); 48 | ptr::copy( 49 | self.ptr.as_ptr().add(index + 1), 50 | self.ptr.as_ptr().add(index), 51 | self.len - index, 52 | ); 53 | result 54 | } 55 | } 56 | ``` 57 | -------------------------------------------------------------------------------- /src/unbounded-lifetimes.md: -------------------------------------------------------------------------------- 1 | # Unbounded Lifetimes 2 | 3 | Unsafe code can often end up producing references or lifetimes out of thin air. 4 | Such lifetimes come into the world as *unbounded*. The most common source of 5 | this is taking a reference to a dereferenced raw pointer, which produces a 6 | reference with an unbounded lifetime. Such a lifetime becomes as big as context 7 | demands. This is in fact more powerful than simply becoming `'static`, because 8 | for instance `&'static &'a T` will fail to typecheck, but the unbound lifetime 9 | will perfectly mold into `&'a &'a T` as needed. However for most intents and 10 | purposes, such an unbounded lifetime can be regarded as `'static`. 11 | 12 | Almost no reference is `'static`, so this is probably wrong. `transmute` and 13 | `transmute_copy` are the two other primary offenders. One should endeavor to 14 | bound an unbounded lifetime as quickly as possible, especially across function 15 | boundaries. 16 | 17 | Given a function, any output lifetimes that don't derive from inputs are 18 | unbounded. For instance: 19 | 20 | 21 | ```rust,no_run 22 | fn get_str<'a>(s: *const String) -> &'a str { 23 | unsafe { &*s } 24 | } 25 | 26 | fn main() { 27 | let soon_dropped = String::from("hello"); 28 | let dangling = get_str(&soon_dropped); 29 | drop(soon_dropped); 30 | println!("Invalid str: {}", dangling); // Invalid str: gӚ_` 31 | } 32 | ``` 33 | 34 | The easiest way to avoid unbounded lifetimes is to use lifetime elision at the 35 | function boundary. If an output lifetime is elided, then it *must* be bounded by 36 | an input lifetime. Of course it might be bounded by the *wrong* lifetime, but 37 | this will usually just cause a compiler error, rather than allow memory safety 38 | to be trivially violated. 39 | 40 | Within a function, bounding lifetimes is more error-prone. The safest and easiest 41 | way to bound a lifetime is to return it from a function with a bound lifetime. 42 | However if this is unacceptable, the reference can be placed in a location with 43 | a specific lifetime. Unfortunately it's impossible to name all lifetimes involved 44 | in a function. 45 | -------------------------------------------------------------------------------- /src/vec/vec-layout.md: -------------------------------------------------------------------------------- 1 | # Layout 2 | 3 | First off, we need to come up with the struct layout. A Vec has three parts: 4 | a pointer to the allocation, the size of the allocation, and the number of 5 | elements that have been initialized. 6 | 7 | Naively, this means we just want this design: 8 | 9 | 10 | ```rust,ignore 11 | pub struct Vec<T> { 12 | ptr: *mut T, 13 | cap: usize, 14 | len: usize, 15 | } 16 | ``` 17 | 18 | And indeed this would compile. Unfortunately, it would be too strict. The 19 | compiler will give us too strict variance.
So a `&Vec<&'static str>` 20 | couldn't be used where a `&Vec<&'a str>` was expected. See [the chapter 21 | on ownership and lifetimes][ownership] for all the details on variance. 22 | 23 | As we saw in the ownership chapter, the standard library uses `Unique` in place of 24 | `*mut T` when it has a raw pointer to an allocation that it owns. Unique is unstable, 25 | so we'd like to not use it if possible, though. 26 | 27 | As a recap, Unique is a wrapper around a raw pointer that declares that: 28 | 29 | * We are covariant over `T` 30 | * We may own a value of type `T` (this is not relevant for our example here, but see 31 | [the chapter on PhantomData][phantom-data] on why the real `std::vec::Vec` needs this) 32 | * We are Send/Sync if `T` is Send/Sync 33 | * Our pointer is never null (so `Option<Vec<T>>` is null-pointer-optimized) 34 | 35 | We can implement all of the above requirements in stable Rust. To do this, instead 36 | of using `Unique` we will use [`NonNull`][NonNull], another wrapper around a 37 | raw pointer, which gives us two of the above properties, namely it is covariant 38 | over `T` and is declared to never be null. By implementing Send/Sync if `T` is, 39 | we get the same results as using `Unique`: 40 | 41 | ```rust 42 | use std::ptr::NonNull; 43 | 44 | pub struct Vec<T> { 45 | ptr: NonNull<T>, 46 | cap: usize, 47 | len: usize, 48 | } 49 | 50 | unsafe impl<T: Send> Send for Vec<T> {} 51 | unsafe impl<T: Sync> Sync for Vec<T> {} 52 | # fn main() {} 53 | ``` 54 | 55 | [ownership]: ../ownership.html 56 | [phantom-data]: ../phantom-data.md 57 | [NonNull]: ../../std/ptr/struct.NonNull.html 58 | -------------------------------------------------------------------------------- /src/poisoning.md: -------------------------------------------------------------------------------- 1 | # Poisoning 2 | 3 | Although all unsafe code *must* ensure it has minimal exception safety, not all 4 | types ensure *maximal* exception safety. Even if the type does, your code may 5 | ascribe additional meaning to it. For instance, an integer is certainly 6 | exception-safe, but has no semantics on its own. It's possible that code that 7 | panics could fail to correctly update the integer, producing an inconsistent 8 | program state. 9 | 10 | This is *usually* fine, because anything that witnesses an exception is about 11 | to get destroyed. For instance, if you send a Vec to another thread and that 12 | thread panics, it doesn't matter if the Vec is in a weird state. It will be 13 | dropped and go away forever. However some types are especially good at smuggling 14 | values across the panic boundary. 15 | 16 | These types may choose to explicitly *poison* themselves if they witness a panic. 17 | Poisoning doesn't entail anything in particular. Generally it just means 18 | preventing normal usage from proceeding. The most notable example of this is the 19 | standard library's Mutex type. A Mutex will poison itself if one of its 20 | MutexGuards (the thing it returns when a lock is obtained) is dropped during a 21 | panic. Any future attempts to lock the Mutex will return an `Err` or panic. 22 | 23 | Mutex poisons not for true safety in the sense that Rust normally cares about. It 24 | poisons as a safety-guard against blindly using the data that comes out of a Mutex 25 | that has witnessed a panic while locked. The data in such a Mutex was likely in the 26 | middle of being modified, and as such may be in an inconsistent or incomplete state.
27 | It is important to note that one cannot violate memory safety with such a type 28 | if it is correctly written. After all, it must be minimally exception-safe! 29 | 30 | However if the Mutex contained, say, a BinaryHeap that does not actually have the 31 | heap property, it's unlikely that any code that uses it will do 32 | what the author intended. As such, the program should not proceed normally. 33 | Still, if you're double-plus-sure that you can do *something* with the value, 34 | the Mutex exposes a method to get the lock anyway. It *is* safe, after all. 35 | Just maybe nonsense. 36 | -------------------------------------------------------------------------------- /src/vec/vec-push-pop.md: -------------------------------------------------------------------------------- 1 | # Push and Pop 2 | 3 | Alright. We can initialize. We can allocate. Let's actually implement some 4 | functionality! Let's start with `push`. All it needs to do is check if we're 5 | full to grow, unconditionally write to the next index, and then increment our 6 | length. 7 | 8 | To do the write we have to be careful not to evaluate the memory we want to write 9 | to. At worst, it's truly uninitialized memory from the allocator. At best it's the 10 | bits of some old value we popped off. Either way, we can't just index to the memory 11 | and dereference it, because that will evaluate the memory as a valid instance of 12 | T. Worse, `foo[idx] = x` will try to call `drop` on the old value of `foo[idx]`! 13 | 14 | The correct way to do this is with `ptr::write`, which just blindly overwrites the 15 | target address with the bits of the value we provide. No evaluation involved. 16 | 17 | For `push`, if the old len (before push was called) is 0, then we want to write 18 | to the 0th index. So we should offset by the old len. 19 | 20 | 21 | ```rust,ignore 22 | pub fn push(&mut self, elem: T) { 23 | if self.len == self.cap { self.grow(); } 24 | 25 | unsafe { 26 | ptr::write(self.ptr.as_ptr().add(self.len), elem); 27 | } 28 | 29 | // Can't fail, we'll OOM first. 30 | self.len += 1; 31 | } 32 | ``` 33 | 34 | Easy! How about `pop`? Although this time the index we want to access is 35 | initialized, Rust won't just let us dereference the location of memory to move 36 | the value out, because that would leave the memory uninitialized! For this we 37 | need `ptr::read`, which just copies out the bits from the target address and 38 | interprets it as a value of type T. This will leave the memory at this address 39 | logically uninitialized, even though there is in fact a perfectly good instance 40 | of T there. 41 | 42 | For `pop`, if the old len is 1, for example, we want to read out of the 0th 43 | index. So we should offset by the new len. 44 | 45 | 46 | ```rust,ignore 47 | pub fn pop(&mut self) -> Option<T> { 48 | if self.len == 0 { 49 | None 50 | } else { 51 | self.len -= 1; 52 | unsafe { 53 | Some(ptr::read(self.ptr.as_ptr().add(self.len))) 54 | } 55 | } 56 | } 57 | ``` 58 | -------------------------------------------------------------------------------- /src/panic-handler.md: -------------------------------------------------------------------------------- 1 | # #[panic_handler] 2 | 3 | `#[panic_handler]` is used to define the behavior of `panic!` in `#![no_std]` applications. 4 | The `#[panic_handler]` attribute must be applied to a function with signature `fn(&PanicInfo) 5 | -> !` and such function must appear *once* in the dependency graph of a binary / dylib / cdylib 6 | crate.
The API of `PanicInfo` can be found in the [API docs]. 7 | 8 | [API docs]: ../core/panic/struct.PanicInfo.html 9 | 10 | Given that `#![no_std]` applications have no *standard* output and that some `#![no_std]` 11 | applications, e.g. embedded applications, need different panicking behaviors for development and for 12 | release it can be helpful to have panic crates, crate that only contain a `#[panic_handler]`. 13 | This way applications can easily swap the panicking behavior by simply linking to a different panic 14 | crate. 15 | 16 | Below is shown an example where an application has a different panicking behavior depending on 17 | whether is compiled using the dev profile (`cargo build`) or using the release profile (`cargo build 18 | --release`). 19 | 20 | `panic-semihosting` crate -- log panic messages to the host stderr using semihosting: 21 | 22 | 23 | ```rust,ignore 24 | #![no_std] 25 | 26 | use core::fmt::{Write, self}; 27 | use core::panic::PanicInfo; 28 | 29 | struct HStderr { 30 | // .. 31 | # _0: (), 32 | } 33 | # 34 | # impl HStderr { 35 | # fn new() -> HStderr { HStderr { _0: () } } 36 | # } 37 | # 38 | # impl fmt::Write for HStderr { 39 | # fn write_str(&mut self, _: &str) -> fmt::Result { Ok(()) } 40 | # } 41 | 42 | #[panic_handler] 43 | fn panic(info: &PanicInfo) -> ! { 44 | let mut host_stderr = HStderr::new(); 45 | 46 | // logs "panicked at '$reason', src/main.rs:27:4" to the host stderr 47 | writeln!(host_stderr, "{}", info).ok(); 48 | 49 | loop {} 50 | } 51 | ``` 52 | 53 | `panic-halt` crate -- halt the thread on panic; messages are discarded: 54 | 55 | 56 | ```rust,ignore 57 | #![no_std] 58 | 59 | use core::panic::PanicInfo; 60 | 61 | #[panic_handler] 62 | fn panic(_info: &PanicInfo) -> ! { 63 | loop {} 64 | } 65 | ``` 66 | 67 | `app` crate: 68 | 69 | 70 | ```rust,ignore 71 | #![no_std] 72 | 73 | // dev profile 74 | #[cfg(debug_assertions)] 75 | extern crate panic_semihosting; 76 | 77 | // release profile 78 | #[cfg(not(debug_assertions))] 79 | extern crate panic_halt; 80 | 81 | fn main() { 82 | // .. 83 | } 84 | ``` 85 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # The Rustonomicon 2 | 3 | The Dark Arts of Advanced and Unsafe Rust Programming 4 | 5 | Nicknamed "the Nomicon." 6 | 7 | ## NOTE: This is a draft document, and may contain serious errors 8 | 9 | > Instead of the programs I had hoped for, there came only a shuddering 10 | blackness and ineffable loneliness; and I saw at last a fearful truth which no 11 | one had ever dared to breathe before — the unwhisperable secret of secrets — The 12 | fact that this language of stone and stridor is not a sentient perpetuation of 13 | Rust as London is of Old London and Paris of Old Paris, but that it is in fact 14 | quite unsafe, its sprawling body imperfectly embalmed and infested with queer 15 | animate things which have nothing to do with it as it was in compilation. 16 | 17 | This book digs into all the awful details that are necessary to understand in 18 | order to write correct Unsafe Rust programs. Due to the nature of this problem, 19 | it may lead to unleashing untold horrors that shatter your psyche into a billion 20 | infinitesimal fragments of despair. 21 | 22 | ## Requirements 23 | 24 | Building the Nomicon requires [mdBook]. 
To get it: 25 | 26 | [mdBook]: https://github.com/rust-lang/mdBook 27 | 28 | ```bash 29 | cargo install mdbook 30 | ``` 31 | 32 | ### `mdbook` usage 33 | 34 | To build the Nomicon use the `build` sub-command: 35 | 36 | ```bash 37 | mdbook build 38 | ``` 39 | 40 | The output will be placed in the `book` subdirectory. To check it out, open the 41 | `index.html` file in your web browser. You can pass the `--open` flag to `mdbook 42 | build` and it'll open the index page in your default browser (if the process is 43 | successful) just like with `cargo doc --open`: 44 | 45 | ```bash 46 | mdbook build --open 47 | ``` 48 | 49 | There is also a `test` sub-command to test all code samples contained in the book: 50 | 51 | ```bash 52 | mdbook test 53 | ``` 54 | 55 | ### `linkcheck` 56 | 57 | We use the `linkcheck` tool to find broken links. 58 | To run it locally: 59 | 60 | ```sh 61 | curl -sSLo linkcheck.sh https://raw.githubusercontent.com/rust-lang/rust/master/src/tools/linkchecker/linkcheck.sh 62 | sh linkcheck.sh --all nomicon 63 | ``` 64 | 65 | ## Contributing 66 | 67 | Given that the Nomicon is still in a draft state, we'd love your help! Please 68 | feel free to open issues about anything, and send in PRs for things you'd like 69 | to fix or change. If your change is large, please open an issue first, so we can 70 | make sure that it's something we'd accept before you go through the work of 71 | getting a PR together. 72 | -------------------------------------------------------------------------------- /src/constructors.md: -------------------------------------------------------------------------------- 1 | # Constructors 2 | 3 | There is exactly one way to create an instance of a user-defined type: name it, 4 | and initialize all its fields at once: 5 | 6 | ```rust 7 | struct Foo { 8 | a: u8, 9 | b: u32, 10 | c: bool, 11 | } 12 | 13 | enum Bar { 14 | X(u32), 15 | Y(bool), 16 | } 17 | 18 | struct Unit; 19 | 20 | let foo = Foo { a: 0, b: 1, c: false }; 21 | let bar = Bar::X(0); 22 | let empty = Unit; 23 | ``` 24 | 25 | That's it. Every other way you make an instance of a type is just calling a 26 | totally vanilla function that does some stuff and eventually bottoms out to The 27 | One True Constructor. 28 | 29 | Unlike C++, Rust does not come with a slew of built-in kinds of constructor. 30 | There are no Copy, Default, Assignment, Move, or whatever constructors. The 31 | reasons for this are varied, but it largely boils down to Rust's philosophy of 32 | *being explicit*. 33 | 34 | Move constructors are meaningless in Rust because we don't enable types to 35 | "care" about their location in memory. Every type must be ready for it to be 36 | blindly memcopied to somewhere else in memory. This means pure on-the-stack-but- 37 | still-movable intrusive linked lists are simply not happening in Rust (safely). 38 | 39 | Assignment and copy constructors similarly don't exist because move semantics 40 | are the only semantics in Rust. At most `x = y` just moves the bits of y into 41 | the x variable. Rust does provide two facilities for providing C++'s copy- 42 | oriented semantics: `Copy` and `Clone`. Clone is our moral equivalent of a copy 43 | constructor, but it's never implicitly invoked. You have to explicitly call 44 | `clone` on an element you want to be cloned. Copy is a special case of Clone 45 | where the implementation is just "copy the bits". 
Copy types *are* implicitly 46 | cloned whenever they're moved, but because of the definition of Copy this just 47 | means not treating the old copy as uninitialized -- a no-op. 48 | 49 | While Rust provides a `Default` trait for specifying the moral equivalent of a 50 | default constructor, it's incredibly rare for this trait to be used. This is 51 | because variables [aren't implicitly initialized][uninit]. Default is basically 52 | only useful for generic programming. In concrete contexts, a type will provide a 53 | static `new` method for any kind of "default" constructor. This has no relation 54 | to `new` in other languages and has no special meaning. It's just a naming 55 | convention. 56 | 57 | TODO: talk about "placement new"? 58 | 59 | [uninit]: uninitialized.html 60 | -------------------------------------------------------------------------------- /src/arc-mutex/arc-final.md: -------------------------------------------------------------------------------- 1 | # Final Code 2 | 3 | Here's the final code, with some added comments and re-ordered imports: 4 | 5 | ```rust 6 | use std::marker::PhantomData; 7 | use std::ops::Deref; 8 | use std::ptr::NonNull; 9 | use std::sync::atomic::{self, AtomicUsize, Ordering}; 10 | 11 | pub struct Arc<T> { 12 | ptr: NonNull<ArcInner<T>>, 13 | phantom: PhantomData<ArcInner<T>>, 14 | } 15 | 16 | pub struct ArcInner<T> { 17 | rc: AtomicUsize, 18 | data: T, 19 | } 20 | 21 | impl<T> Arc<T> { 22 | pub fn new(data: T) -> Arc<T> { 23 | // We start the reference count at 1, as that first reference is the 24 | // current pointer. 25 | let boxed = Box::new(ArcInner { 26 | rc: AtomicUsize::new(1), 27 | data, 28 | }); 29 | Arc { 30 | // It is okay to call `.unwrap()` here as we get a pointer from 31 | // `Box::into_raw` which is guaranteed to not be null. 32 | ptr: NonNull::new(Box::into_raw(boxed)).unwrap(), 33 | phantom: PhantomData, 34 | } 35 | } 36 | } 37 | 38 | unsafe impl<T: Sync + Send> Send for Arc<T> {} 39 | unsafe impl<T: Sync + Send> Sync for Arc<T> {} 40 | 41 | impl<T> Deref for Arc<T> { 42 | type Target = T; 43 | 44 | fn deref(&self) -> &T { 45 | let inner = unsafe { self.ptr.as_ref() }; 46 | &inner.data 47 | } 48 | } 49 | 50 | impl<T> Clone for Arc<T> { 51 | fn clone(&self) -> Arc<T> { 52 | let inner = unsafe { self.ptr.as_ref() }; 53 | // Using a relaxed ordering is alright here as we don't need any atomic 54 | // synchronization here as we're not modifying or accessing the inner 55 | // data. 56 | let old_rc = inner.rc.fetch_add(1, Ordering::Relaxed); 57 | 58 | if old_rc >= isize::MAX as usize { 59 | std::process::abort(); 60 | } 61 | 62 | Self { 63 | ptr: self.ptr, 64 | phantom: PhantomData, 65 | } 66 | } 67 | } 68 | 69 | impl<T> Drop for Arc<T> { 70 | fn drop(&mut self) { 71 | let inner = unsafe { self.ptr.as_ref() }; 72 | if inner.rc.fetch_sub(1, Ordering::Release) != 1 { 73 | return; 74 | } 75 | // This fence is needed to prevent reordering of the use and deletion 76 | // of the data. 77 | atomic::fence(Ordering::Acquire); 78 | // This is safe as we know we have the last pointer to the `ArcInner` 79 | // and that its pointer is valid.
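        // `Box::from_raw` rebuilds the `Box` from our raw pointer; letting that
        // `Box` drop at the end of the statement runs the destructor of the inner
        // `ArcInner` (dropping `data`) and frees the heap allocation.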
80 | unsafe { Box::from_raw(self.ptr.as_ptr()); } 81 | } 82 | } 83 | ``` 84 | -------------------------------------------------------------------------------- /src/SUMMARY.md: -------------------------------------------------------------------------------- 1 | # Summary 2 | 3 | [Introduction](intro.md) 4 | 5 | * [Meet Safe and Unsafe](meet-safe-and-unsafe.md) 6 | * [How Safe and Unsafe Interact](safe-unsafe-meaning.md) 7 | * [What Unsafe Can Do](what-unsafe-does.md) 8 | * [Working with Unsafe](working-with-unsafe.md) 9 | * [Data Layout](data.md) 10 | * [repr(Rust)](repr-rust.md) 11 | * [Exotically Sized Types](exotic-sizes.md) 12 | * [Other reprs](other-reprs.md) 13 | * [Ownership](ownership.md) 14 | * [References](references.md) 15 | * [Aliasing](aliasing.md) 16 | * [Lifetimes](lifetimes.md) 17 | * [Limits of Lifetimes](lifetime-mismatch.md) 18 | * [Lifetime Elision](lifetime-elision.md) 19 | * [Unbounded Lifetimes](unbounded-lifetimes.md) 20 | * [Higher-Rank Trait Bounds](hrtb.md) 21 | * [Subtyping and Variance](subtyping.md) 22 | * [Drop Check](dropck.md) 23 | * [PhantomData](phantom-data.md) 24 | * [Splitting Borrows](borrow-splitting.md) 25 | * [Type Conversions](conversions.md) 26 | * [Coercions](coercions.md) 27 | * [The Dot Operator](dot-operator.md) 28 | * [Casts](casts.md) 29 | * [Transmutes](transmutes.md) 30 | * [Uninitialized Memory](uninitialized.md) 31 | * [Checked](checked-uninit.md) 32 | * [Drop Flags](drop-flags.md) 33 | * [Unchecked](unchecked-uninit.md) 34 | * [Ownership Based Resource Management](obrm.md) 35 | * [Constructors](constructors.md) 36 | * [Destructors](destructors.md) 37 | * [Leaking](leaking.md) 38 | * [Unwinding](unwinding.md) 39 | * [Exception Safety](exception-safety.md) 40 | * [Poisoning](poisoning.md) 41 | * [Concurrency](concurrency.md) 42 | * [Races](races.md) 43 | * [Send and Sync](send-and-sync.md) 44 | * [Atomics](atomics.md) 45 | * [Implementing Vec](./vec/vec.md) 46 | * [Layout](./vec/vec-layout.md) 47 | * [Allocating](./vec/vec-alloc.md) 48 | * [Push and Pop](./vec/vec-push-pop.md) 49 | * [Deallocating](./vec/vec-dealloc.md) 50 | * [Deref](./vec/vec-deref.md) 51 | * [Insert and Remove](./vec/vec-insert-remove.md) 52 | * [IntoIter](./vec/vec-into-iter.md) 53 | * [RawVec](./vec/vec-raw.md) 54 | * [Drain](./vec/vec-drain.md) 55 | * [Handling Zero-Sized Types](./vec/vec-zsts.md) 56 | * [Final Code](./vec/vec-final.md) 57 | * [Implementing Arc and Mutex](./arc-mutex/arc-and-mutex.md) 58 | * [Arc](./arc-mutex/arc.md) 59 | * [Layout](./arc-mutex/arc-layout.md) 60 | * [Base Code](./arc-mutex/arc-base.md) 61 | * [Cloning](./arc-mutex/arc-clone.md) 62 | * [Dropping](./arc-mutex/arc-drop.md) 63 | * [Final Code](./arc-mutex/arc-final.md) 64 | * [FFI](ffi.md) 65 | * [Beneath `std`](beneath-std.md) 66 | * [#[panic_handler]](panic-handler.md) 67 | -------------------------------------------------------------------------------- /src/lifetime-elision.md: -------------------------------------------------------------------------------- 1 | # Lifetime Elision 2 | 3 | In order to make common patterns more ergonomic, Rust allows lifetimes to be 4 | *elided* in function signatures. 
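For example (a minimal sketch with made-up functions; the precise rules are given below), the following two definitions mean exactly the same thing, and the assertion merely shows that both compile and behave identically:

```rust
fn first(s: &str) -> &str { &s[..1] }                    // elided
fn first_expanded<'a>(s: &'a str) -> &'a str { &s[..1] } // expanded

fn main() {
    assert_eq!(first("hi"), first_expanded("hi"));
}
```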
5 | 6 | A *lifetime position* is anywhere you can write a lifetime in a type: 7 | 8 | 9 | ```rust,ignore 10 | &'a T 11 | &'a mut T 12 | T<'a> 13 | ``` 14 | 15 | Lifetime positions can appear as either "input" or "output": 16 | 17 | * For `fn` definitions, `fn` types, and the traits `Fn`, `FnMut`, and `FnOnce`, 18 | input refers to the types of the formal arguments, while output refers to 19 | result types. So `fn foo(s: &str) -> (&str, &str)` has elided one lifetime in 20 | input position and two lifetimes in output position. Note that the input 21 | positions of a `fn` method definition do not include the lifetimes that occur 22 | in the method's `impl` header (nor lifetimes that occur in the trait header, 23 | for a default method). 24 | 25 | * For `impl` headers, all types are input. So `impl Trait<&T> for Struct<&T>` 26 | has elided two lifetimes in input position, while `impl Struct<&T>` has elided 27 | one. 28 | 29 | Elision rules are as follows: 30 | 31 | * Each elided lifetime in input position becomes a distinct lifetime 32 | parameter. 33 | 34 | * If there is exactly one input lifetime position (elided or not), that lifetime 35 | is assigned to *all* elided output lifetimes. 36 | 37 | * If there are multiple input lifetime positions, but one of them is `&self` or 38 | `&mut self`, the lifetime of `self` is assigned to *all* elided output lifetimes. 39 | 40 | * Otherwise, it is an error to elide an output lifetime. 41 | 42 | Examples: 43 | 44 | 45 | ```rust,ignore 46 | fn print(s: &str); // elided 47 | fn print<'a>(s: &'a str); // expanded 48 | 49 | fn debug(lvl: usize, s: &str); // elided 50 | fn debug<'a>(lvl: usize, s: &'a str); // expanded 51 | 52 | fn substr(s: &str, until: usize) -> &str; // elided 53 | fn substr<'a>(s: &'a str, until: usize) -> &'a str; // expanded 54 | 55 | fn get_str() -> &str; // ILLEGAL 56 | 57 | fn frob(s: &str, t: &str) -> &str; // ILLEGAL 58 | 59 | fn get_mut(&mut self) -> &mut T; // elided 60 | fn get_mut<'a>(&'a mut self) -> &'a mut T; // expanded 61 | 62 | fn args<T: ToCStr>(&mut self, args: &[T]) -> &mut Command // elided 63 | fn args<'a, 'b, T: ToCStr>(&'a mut self, args: &'b [T]) -> &'a mut Command // expanded 64 | 65 | fn new(buf: &mut [u8]) -> BufWriter; // elided 66 | fn new(buf: &mut [u8]) -> BufWriter<'_>; // elided (with `rust_2018_idioms`) 67 | fn new<'a>(buf: &'a mut [u8]) -> BufWriter<'a> // expanded 68 | ``` 69 | -------------------------------------------------------------------------------- /src/ownership.md: -------------------------------------------------------------------------------- 1 | # Ownership and Lifetimes 2 | 3 | Ownership is the breakout feature of Rust. It allows Rust to be completely 4 | memory-safe and efficient, while avoiding garbage collection. Before getting 5 | into the ownership system in detail, we will consider the motivation of this 6 | design. 7 | 8 | We will assume that you accept that garbage collection (GC) is not always an 9 | optimal solution, and that it is desirable to manually manage memory in some 10 | contexts. If you do not accept this, might I interest you in a different 11 | language? 12 | 13 | Regardless of your feelings on GC, it is pretty clearly a *massive* boon to 14 | making code safe. You never have to worry about things going away *too soon* 15 | (although whether you still wanted to be pointing at that thing is a different 16 | issue...). This is a pervasive problem that C and C++ programs need to deal 17 | with.
Consider this simple mistake that all of us who have used a non-GC'd 18 | language have made at one point: 19 | 20 | ```rust,compile_fail 21 | fn as_str(data: &u32) -> &str { 22 | // compute the string 23 | let s = format!("{}", data); 24 | 25 | // OH NO! We returned a reference to something that 26 | // exists only in this function! 27 | // Dangling pointer! Use after free! Alas! 28 | // (this does not compile in Rust) 29 | &s 30 | } 31 | ``` 32 | 33 | This is exactly what Rust's ownership system was built to solve. 34 | Rust knows the scope in which the `&s` lives, and as such can prevent it from 35 | escaping. However this is a simple case that even a C compiler could plausibly 36 | catch. Things get more complicated as code gets bigger and pointers get fed through 37 | various functions. Eventually, a C compiler will fall down and won't be able to 38 | perform sufficient escape analysis to prove your code unsound. It will consequently 39 | be forced to accept your program on the assumption that it is correct. 40 | 41 | This will never happen to Rust. It's up to the programmer to prove to the 42 | compiler that everything is sound. 43 | 44 | Of course, Rust's story around ownership is much more complicated than just 45 | verifying that references don't escape the scope of their referent. That's 46 | because ensuring pointers are always valid is much more complicated than this. 47 | For instance in this code, 48 | 49 | ```rust,compile_fail 50 | let mut data = vec![1, 2, 3]; 51 | // get an internal reference 52 | let x = &data[0]; 53 | 54 | // OH NO! `push` causes the backing storage of `data` to be reallocated. 55 | // Dangling pointer! Use after free! Alas! 56 | // (this does not compile in Rust) 57 | data.push(4); 58 | 59 | println!("{}", x); 60 | ``` 61 | 62 | naive scope analysis would be insufficient to prevent this bug, because `data` 63 | does in fact live as long as we needed. However it was *changed* while we had 64 | a reference into it. This is why Rust requires any references to freeze the 65 | referent and its owners. 66 | -------------------------------------------------------------------------------- /src/arc-mutex/arc-layout.md: -------------------------------------------------------------------------------- 1 | # Layout 2 | 3 | Let's start by making the layout for our implementation of `Arc`. 4 | 5 | An `Arc` provides thread-safe shared ownership of a value of type `T`, 6 | allocated in the heap. Sharing implies immutability in Rust, so we don't need to 7 | design anything that manages access to that value, right? Although interior 8 | mutability types like Mutex allow Arc's users to create shared mutability, Arc 9 | itself doesn't need to concern itself with these issues. 10 | 11 | However there _is_ one place where Arc needs to concern itself with mutation: 12 | destruction. When all the owners of the Arc go away, we need to be able to 13 | `drop` its contents and free its allocation. So we need a way for an owner to 14 | know if it's the _last_ owner, and the simplest way to do that is with a count 15 | of the owners -- Reference Counting. 16 | 17 | Unfortunately, this reference count is inherently shared mutable state, so Arc 18 | _does_ need to think about synchronization. We _could_ use a Mutex for this, but 19 | that's overkill. Instead, we'll use atomics. And since everyone already needs a 20 | pointer to the T's allocation, we might as well put the reference count in that 21 | same allocation. 
22 | 23 | Naively, it would look something like this: 24 | 25 | ```rust 26 | use std::sync::atomic; 27 | 28 | pub struct Arc<T> { 29 | ptr: *mut ArcInner<T>, 30 | } 31 | 32 | pub struct ArcInner<T> { 33 | rc: atomic::AtomicUsize, 34 | data: T, 35 | } 36 | ``` 37 | 38 | This would compile, however it would be incorrect. First of all, the compiler 39 | will give us too strict variance. For example, an `Arc<&'static str>` couldn't 40 | be used where an `Arc<&'a str>` was expected. More importantly, it will give 41 | incorrect ownership information to the drop checker, as it will assume we don't 42 | own any values of type `T`. As this is a structure providing shared ownership of 43 | a value, at some point there will be an instance of this structure that entirely 44 | owns its data. See [the chapter on ownership and lifetimes](../ownership.md) for 45 | all the details on variance and drop check. 46 | 47 | To fix the first problem, we can use `NonNull`. Note that `NonNull` is a 48 | wrapper around a raw pointer that declares that: 49 | 50 | * We are covariant over `T` 51 | * Our pointer is never null 52 | 53 | To fix the second problem, we can include a `PhantomData` marker containing an 54 | `ArcInner`. This will tell the drop checker that we have some notion of 55 | ownership of a value of `ArcInner` (which itself contains some `T`). 56 | 57 | With these changes we get our final structure: 58 | 59 | ```rust 60 | use std::marker::PhantomData; 61 | use std::ptr::NonNull; 62 | use std::sync::atomic::AtomicUsize; 63 | 64 | pub struct Arc<T> { 65 | ptr: NonNull<ArcInner<T>>, 66 | phantom: PhantomData<ArcInner<T>>, 67 | } 68 | 69 | pub struct ArcInner<T> { 70 | rc: AtomicUsize, 71 | data: T, 72 | } 73 | ``` 74 | -------------------------------------------------------------------------------- /src/hrtb.md: -------------------------------------------------------------------------------- 1 | # Higher-Rank Trait Bounds (HRTBs) 2 | 3 | Rust's `Fn` traits are a little bit magic. For instance, we can write the 4 | following code: 5 | 6 | ```rust 7 | struct Closure<F> { 8 | data: (u8, u16), 9 | func: F, 10 | } 11 | 12 | impl<F> Closure<F> 13 | where F: Fn(&(u8, u16)) -> &u8, 14 | { 15 | fn call(&self) -> &u8 { 16 | (self.func)(&self.data) 17 | } 18 | } 19 | 20 | fn do_it(data: &(u8, u16)) -> &u8 { &data.0 } 21 | 22 | fn main() { 23 | let clo = Closure { data: (0, 1), func: do_it }; 24 | println!("{}", clo.call()); 25 | } 26 | ``` 27 | 28 | If we try to naively desugar this code in the same way that we did in the 29 | [lifetimes section][lt], we run into some trouble: 30 | 31 | 32 | ```rust,ignore 33 | // NOTE: `&'b data.0` and `'x: {` is not valid syntax! 34 | struct Closure<F> { 35 | data: (u8, u16), 36 | func: F, 37 | } 38 | 39 | impl<F> Closure<F> 40 | // where F: Fn(&'??? (u8, u16)) -> &'??? u8, 41 | { 42 | fn call<'a>(&'a self) -> &'a u8 { 43 | (self.func)(&self.data) 44 | } 45 | } 46 | 47 | fn do_it<'b>(data: &'b (u8, u16)) -> &'b u8 { &'b data.0 } 48 | 49 | fn main() { 50 | 'x: { 51 | let clo = Closure { data: (0, 1), func: do_it }; 52 | println!("{}", clo.call()); 53 | } 54 | } 55 | ``` 56 | 57 | How on earth are we supposed to express the lifetimes on `F`'s trait bound? We 58 | need to provide some lifetime there, but the lifetime we care about can't be 59 | named until we enter the body of `call`! Also, that isn't some fixed lifetime; 60 | `call` works with *any* lifetime `&self` happens to have at that point. 61 | 62 | This job requires The Magic of Higher-Rank Trait Bounds (HRTBs).
The way we 63 | desugar this is as follows: 64 | 65 | 66 | ```rust,ignore 67 | where for<'a> F: Fn(&'a (u8, u16)) -> &'a u8, 68 | ``` 69 | 70 | Alternatively: 71 | 72 | 73 | ```rust,ignore 74 | where F: for<'a> Fn(&'a (u8, u16)) -> &'a u8, 75 | ``` 76 | 77 | (Where `Fn(a, b, c) -> d` is itself just sugar for the unstable *real* `Fn` 78 | trait) 79 | 80 | `for<'a>` can be read as "for all choices of `'a`", and basically produces an 81 | *infinite list* of trait bounds that F must satisfy. Intense. There aren't many 82 | places outside of the `Fn` traits where we encounter HRTBs, and even for 83 | those we have a nice magic sugar for the common cases. 84 | 85 | In summary, we can rewrite the original code more explicitly as: 86 | 87 | ```rust 88 | struct Closure { 89 | data: (u8, u16), 90 | func: F, 91 | } 92 | 93 | impl Closure 94 | where for<'a> F: Fn(&'a (u8, u16)) -> &'a u8, 95 | { 96 | fn call(&self) -> &u8 { 97 | (self.func)(&self.data) 98 | } 99 | } 100 | 101 | fn do_it(data: &(u8, u16)) -> &u8 { &data.0 } 102 | 103 | fn main() { 104 | let clo = Closure { data: (0, 1), func: do_it }; 105 | println!("{}", clo.call()); 106 | } 107 | ``` 108 | 109 | [lt]: lifetimes.html 110 | -------------------------------------------------------------------------------- /src/unwinding.md: -------------------------------------------------------------------------------- 1 | # Unwinding 2 | 3 | Rust has a *tiered* error-handling scheme: 4 | 5 | * If something might reasonably be absent, Option is used. 6 | * If something goes wrong and can reasonably be handled, Result is used. 7 | * If something goes wrong and cannot reasonably be handled, the thread panics. 8 | * If something catastrophic happens, the program aborts. 9 | 10 | Option and Result are overwhelmingly preferred in most situations, especially 11 | since they can be promoted into a panic or abort at the API user's discretion. 12 | Panics cause the thread to halt normal execution and unwind its stack, calling 13 | destructors as if every function instantly returned. 14 | 15 | As of 1.0, Rust is of two minds when it comes to panics. In the long-long-ago, 16 | Rust was much more like Erlang. Like Erlang, Rust had lightweight tasks, 17 | and tasks were intended to kill themselves with a panic when they reached an 18 | untenable state. Unlike an exception in Java or C++, a panic could not be 19 | caught at any time. Panics could only be caught by the owner of the task, at which 20 | point they had to be handled or *that* task would itself panic. 21 | 22 | Unwinding was important to this story because if a task's 23 | destructors weren't called, it would cause memory and other system resources to 24 | leak. Since tasks were expected to die during normal execution, this would make 25 | Rust very poor for long-running systems! 26 | 27 | As the Rust we know today came to be, this style of programming grew out of 28 | fashion in the push for less-and-less abstraction. Light-weight tasks were 29 | killed in the name of heavy-weight OS threads. Still, on stable Rust as of 1.0 30 | panics can only be caught by the parent thread. This means catching a panic 31 | requires spinning up an entire OS thread! This unfortunately stands in conflict 32 | to Rust's philosophy of zero-cost abstractions. 33 | 34 | There is an API called [`catch_unwind`] that enables catching a panic 35 | without spawning a thread. Still, we would encourage you to only do this 36 | sparingly. 
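For a sense of what this looks like in practice, here is a minimal sketch (the closures capture nothing, so they are trivially unwind-safe; `catch_unwind` returns a `Result` whose `Err` carries the panic payload):

```rust
use std::panic;

fn main() {
    // The panic inside the closure is caught here instead of unwinding further.
    let caught = panic::catch_unwind(|| {
        panic!("boom");
    });
    assert!(caught.is_err());

    // Values returned normally pass through untouched.
    let ok = panic::catch_unwind(|| 1 + 1);
    assert_eq!(ok.ok(), Some(2));
}
```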
In particular, Rust's current unwinding implementation is heavily 37 | optimized for the "doesn't unwind" case. If a program doesn't unwind, there 38 | should be no runtime cost for the program being *ready* to unwind. As a 39 | consequence, actually unwinding will be more expensive than in e.g. Java. 40 | Don't build your programs to unwind under normal circumstances. Ideally, you 41 | should only panic for programming errors or *extreme* problems. 42 | 43 | Rust's unwinding strategy is not specified to be fundamentally compatible 44 | with any other language's unwinding. As such, unwinding into Rust from another 45 | language, or unwinding into another language from Rust is Undefined Behavior. 46 | You must *absolutely* catch any panics at the FFI boundary! What you do at that 47 | point is up to you, but *something* must be done. If you fail to do this, 48 | at best, your application will crash and burn. At worst, your application *won't* 49 | crash and burn, and will proceed with completely clobbered state. 50 | 51 | [`catch_unwind`]: https://doc.rust-lang.org/std/panic/fn.catch_unwind.html 52 | -------------------------------------------------------------------------------- /src/transmutes.md: -------------------------------------------------------------------------------- 1 | # Transmutes 2 | 3 | Get out of our way type system! We're going to reinterpret these bits or die 4 | trying! Even though this book is all about doing things that are unsafe, I 5 | really can't emphasize enough that you should deeply think about finding Another Way 6 | than the operations covered in this section. This is really, truly, the most 7 | horribly unsafe thing you can do in Rust. The guardrails here are dental floss. 8 | 9 | [`mem::transmute`][transmute] takes a value of type `T` and reinterprets 10 | it to have type `U`. The only restriction is that the `T` and `U` are verified 11 | to have the same size. The ways to cause Undefined Behavior with this are mind 12 | boggling. 13 | 14 | * First and foremost, creating an instance of *any* type with an invalid state 15 | is going to cause arbitrary chaos that can't really be predicted. Do not 16 | transmute `3` to `bool`. Even if you never *do* anything with the `bool`. Just 17 | don't. 18 | 19 | * Transmute has an overloaded return type. If you do not specify the return type 20 | it may produce a surprising type to satisfy inference. 21 | 22 | * Transmuting an `&` to `&mut` is Undefined Behavior. While certain usages may 23 | *appear* safe, note that the Rust optimizer is free to assume that a shared 24 | reference won't change through its lifetime and thus such transmutation will 25 | run afoul of those assumptions. So: 26 | * Transmuting an `&` to `&mut` is *always* Undefined Behavior. 27 | * No you can't do it. 28 | * No you're not special. 29 | 30 | * Transmuting to a reference without an explicitly provided lifetime 31 | produces an [unbounded lifetime]. 32 | 33 | * When transmuting between different compound types, you have to make sure they 34 | are laid out the same way! If layouts differ, the wrong fields are going to 35 | get filled with the wrong data, which will make you unhappy and can also be 36 | Undefined Behavior (see above). 37 | 38 | So how do you know if the layouts are the same? For `repr(C)` types and 39 | `repr(transparent)` types, layout is precisely defined. But for your 40 | run-of-the-mill `repr(Rust)`, it is not. Even different instances of the same 41 | generic type can have wildly different layout. 
`Vec` and `Vec` 42 | *might* have their fields in the same order, or they might not. The details of 43 | what exactly is and is not guaranteed for data layout are still being worked 44 | out over [at the UCG WG][ucg-layout]. 45 | 46 | [`mem::transmute_copy`][transmute_copy] somehow manages to be *even more* 47 | wildly unsafe than this. It copies `size_of` bytes out of an `&T` and 48 | interprets them as a `U`. The size check that `mem::transmute` has is gone (as 49 | it may be valid to copy out a prefix), though it is Undefined Behavior for `U` 50 | to be larger than `T`. 51 | 52 | Also of course you can get all of the functionality of these functions using raw 53 | pointer casts or `union`s, but without any of the lints or other basic sanity 54 | checks. Raw pointer casts and `union`s do not magically avoid the above rules. 55 | 56 | [unbounded lifetime]: ./unbounded-lifetimes.md 57 | [transmute]: ../std/mem/fn.transmute.html 58 | [transmute_copy]: ../std/mem/fn.transmute_copy.html 59 | [ucg-layout]: https://rust-lang.github.io/unsafe-code-guidelines/layout.html 60 | -------------------------------------------------------------------------------- /src/beneath-std.md: -------------------------------------------------------------------------------- 1 | # Beneath `std` 2 | 3 | This section documents features that are normally provided by the `std` crate and 4 | that `#![no_std]` developers have to deal with (i.e. provide) to build 5 | `#![no_std]` binary crates. 6 | 7 | ## Using `libc` 8 | 9 | In order to build a `#[no_std]` executable we will need `libc` as a dependency. 10 | We can specify this using our `Cargo.toml` file: 11 | 12 | ```toml 13 | [dependencies] 14 | libc = { version = "0.2.146", default-features = false } 15 | ``` 16 | 17 | Note that the default features have been disabled. This is a critical step - 18 | **the default features of `libc` include the `std` crate and so must be 19 | disabled.** 20 | 21 | Alternatively, we can use the unstable `rustc_private` private feature together 22 | with an `extern crate libc;` declaration as shown in the examples below. Note that 23 | windows-msvc targets do not require a libc, and correspondingly there is no `libc` 24 | crate in their sysroot. We do not need the `extern crate libc;` below, and having it 25 | on a windows-msvc target would be a compile error. 26 | 27 | ## Writing an executable without `std` 28 | 29 | We will probably need a nightly version of the compiler to produce 30 | a `#![no_std]` executable because on many platforms, we have to provide the 31 | `eh_personality` [lang item], which is unstable. 32 | 33 | You will need to define a symbol for the entry point that is suitable for your target. For example, `main`, `_start`, `WinMain`, or whatever starting point is relevant for your target. 34 | Additionally, you need to use the `#![no_main]` attribute to prevent the compiler from attempting to generate an entry point itself. 35 | 36 | Additionally, it's required to define a [panic handler function](panic-handler.html). 37 | 38 | ```rust 39 | #![feature(lang_items, core_intrinsics, rustc_private)] 40 | #![allow(internal_features)] 41 | #![no_std] 42 | #![no_main] 43 | 44 | // Necessary for `panic = "unwind"` builds on cfg(unix) platforms. 45 | #![feature(panic_unwind)] 46 | extern crate unwind; 47 | 48 | // Pull in the system libc library for what crt0.o likely requires. 
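// As noted above, windows-msvc targets ship no `libc` crate in their
// sysroot, so the `cfg` below skips this declaration on Windows.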
49 | #[cfg(not(windows))] 50 | extern crate libc; 51 | 52 | use core::ffi::{c_char, c_int}; 53 | use core::panic::PanicInfo; 54 | 55 | // Entry point for this program. 56 | #[unsafe(no_mangle)] // ensure that this symbol is included in the output as `main` 57 | extern "C" fn main(_argc: c_int, _argv: *const *const c_char) -> c_int { 58 | 0 59 | } 60 | 61 | // These functions are used by the compiler, but not for an empty program like this. 62 | // They are normally provided by `std`. 63 | #[lang = "eh_personality"] 64 | fn rust_eh_personality() {} 65 | #[panic_handler] 66 | fn panic_handler(_info: &PanicInfo) -> ! { core::intrinsics::abort() } 67 | ``` 68 | 69 | If you are working with a target that doesn't have binary releases of the 70 | standard library available via rustup (this probably means you are building the 71 | `core` crate yourself) and need compiler-rt intrinsics (i.e. you are probably 72 | getting linker errors when building an executable: 73 | ``undefined reference to `__aeabi_memcpy'``), you need to manually link to the 74 | [`compiler_builtins` crate] to get those intrinsics and solve the linker errors. 75 | 76 | [`compiler_builtins` crate]: https://crates.io/crates/compiler_builtins 77 | [lang item]: https://doc.rust-lang.org/nightly/unstable-book/language-features/lang-items.html 78 | -------------------------------------------------------------------------------- /src/intro.md: -------------------------------------------------------------------------------- 1 | # The Rustonomicon 2 | 3 |
4 | 5 | Warning: 6 | This book is incomplete. 7 | Documenting everything and rewriting outdated parts take a while. 8 | See the [issue tracker] to check what's missing/outdated, and if there are any mistakes or ideas that haven't been reported, feel free to open a new issue there. 9 | 10 |
11 | 12 | [issue tracker]: https://github.com/rust-lang/nomicon/issues 13 | 14 | ## The Dark Arts of Unsafe Rust 15 | 16 | > THE KNOWLEDGE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF UNLEASHING INDESCRIBABLE HORRORS THAT SHATTER YOUR PSYCHE AND SET YOUR MIND ADRIFT IN THE UNKNOWABLY INFINITE COSMOS. 17 | 18 | The Rustonomicon digs into all the awful details that you need to understand when writing Unsafe Rust programs. 19 | 20 | Should you wish a long and happy career of writing Rust programs, you should turn back now and forget you ever saw this book. 21 | It is not necessary. 22 | However if you intend to write unsafe code — or just want to dig into the guts of the language — this book contains lots of useful information. 23 | 24 | Unlike *[The Rust Programming Language][trpl]*, we will be assuming considerable prior knowledge. 25 | In particular, you should be comfortable with basic systems programming and Rust. 26 | If you don't feel comfortable with these topics, you should consider reading [The Book][trpl] first. 27 | That said, we won't assume you have read it, and we will take care to occasionally give a refresher on the basics where appropriate. 28 | You can skip straight to this book if you want; just know that we won't be explaining everything from the ground up. 29 | 30 | This book exists primarily as a high-level companion to [The Reference][ref]. 31 | Where The Reference exists to detail the syntax and semantics of every part of the language, The Rustonomicon exists to describe how to use those pieces together, and the issues that you will have in doing so. 32 | 33 | The Reference will tell you the syntax and semantics of references, destructors, and unwinding, but it won't tell you how combining them can lead to exception-safety issues, or how to deal with those issues. 34 | 35 | It should be noted that we haven't synced The Rustnomicon and The Reference well, so they may have duplicate content. 36 | In general, if the two documents disagree, The Reference should be assumed to be correct (it isn't yet considered normative, it's just better maintained). 37 | 38 | Topics that are within the scope of this book include: the meaning of (un)safety, unsafe primitives provided by the language and standard library, techniques for creating safe abstractions with those unsafe primitives, subtyping and variance, exception-safety (panic/unwind-safety), working with uninitialized memory, type punning, concurrency, interoperating with other languages (FFI), optimization tricks, how constructs lower to compiler/OS/hardware primitives, how to **not** make the memory model people angry, how you're **going** to make the memory model people angry, and more. 39 | 40 | The Rustonomicon is not a place to exhaustively describe the semantics and guarantees of every single API in the standard library, nor is it a place to exhaustively describe every feature of Rust. 41 | 42 | Unless otherwise noted, Rust code in this book uses the Rust 2024 edition. 43 | 44 | [trpl]: ../book/index.html 45 | [ref]: ../reference/index.html 46 | -------------------------------------------------------------------------------- /src/checked-uninit.md: -------------------------------------------------------------------------------- 1 | # Checked Uninitialized Memory 2 | 3 | Like C, all stack variables in Rust are uninitialized until a value is 4 | explicitly assigned to them. 
Unlike C, Rust statically prevents you from ever 5 | reading them until you do: 6 | 7 | ```rust,compile_fail 8 | fn main() { 9 | let x: i32; 10 | println!("{}", x); 11 | } 12 | ``` 13 | 14 | ```text 15 | | 16 | 3 | println!("{}", x); 17 | | ^ use of possibly uninitialized `x` 18 | ``` 19 | 20 | This is based off of a basic branch analysis: every branch must assign a value 21 | to `x` before it is first used. For short, we also say that "`x` is init" or 22 | "`x` is uninit". 23 | 24 | Interestingly, Rust doesn't require the variable 25 | to be mutable to perform a delayed initialization if every branch assigns 26 | exactly once. However the analysis does not take advantage of constant analysis 27 | or anything like that. So this compiles: 28 | 29 | ```rust 30 | fn main() { 31 | let x: i32; 32 | 33 | if true { 34 | x = 1; 35 | } else { 36 | x = 2; 37 | } 38 | 39 | println!("{}", x); 40 | } 41 | ``` 42 | 43 | but this doesn't: 44 | 45 | ```rust,compile_fail 46 | fn main() { 47 | let x: i32; 48 | if true { 49 | x = 1; 50 | } 51 | println!("{}", x); 52 | } 53 | ``` 54 | 55 | ```text 56 | | 57 | 6 | println!("{}", x); 58 | | ^ use of possibly uninitialized `x` 59 | ``` 60 | 61 | while this does: 62 | 63 | ```rust 64 | fn main() { 65 | let x: i32; 66 | if true { 67 | x = 1; 68 | println!("{}", x); 69 | } 70 | // Don't care that there are branches where it's not initialized 71 | // since we don't use the value in those branches 72 | } 73 | ``` 74 | 75 | Of course, while the analysis doesn't consider actual values, it does 76 | have a relatively sophisticated understanding of dependencies and control 77 | flow. For instance, this works: 78 | 79 | ```rust 80 | let x: i32; 81 | 82 | loop { 83 | // Rust doesn't understand that this branch will be taken unconditionally, 84 | // because it relies on actual values. 85 | if true { 86 | // But it does understand that it will only be taken once because 87 | // we unconditionally break out of it. Therefore `x` doesn't 88 | // need to be marked as mutable. 89 | x = 0; 90 | break; 91 | } 92 | } 93 | // It also knows that it's impossible to get here without reaching the break. 94 | // And therefore that `x` must be initialized here! 95 | println!("{}", x); 96 | ``` 97 | 98 | If a value is moved out of a variable, that variable becomes logically 99 | uninitialized if the type of the value isn't Copy. That is: 100 | 101 | ```rust 102 | fn main() { 103 | let x = 0; 104 | let y = Box::new(0); 105 | let z1 = x; // x is still valid because i32 is Copy 106 | let z2 = y; // y is now logically uninitialized because Box isn't Copy 107 | } 108 | ``` 109 | 110 | However reassigning `y` in this example *would* require `y` to be marked as 111 | mutable, as a Safe Rust program could observe that the value of `y` changed: 112 | 113 | ```rust 114 | fn main() { 115 | let mut y = Box::new(0); 116 | let z = y; // y is now logically uninitialized because Box isn't Copy 117 | y = Box::new(1); // reinitialize y 118 | } 119 | ``` 120 | 121 | Otherwise it's like `y` is a brand new variable. 122 | -------------------------------------------------------------------------------- /src/drop-flags.md: -------------------------------------------------------------------------------- 1 | # Drop Flags 2 | 3 | The examples in the previous section introduce an interesting problem for Rust. 4 | We have seen that it's possible to conditionally initialize, deinitialize, and 5 | reinitialize locations of memory totally safely. 
For Copy types, this isn't 6 | particularly notable since they're just a random pile of bits. However types 7 | with destructors are a different story: Rust needs to know whether to call a 8 | destructor whenever a variable is assigned to, or a variable goes out of scope. 9 | How can it do this with conditional initialization? 10 | 11 | Note that this is not a problem that all assignments need worry about. In 12 | particular, assigning through a dereference unconditionally drops, and assigning 13 | in a `let` unconditionally doesn't drop: 14 | 15 | ```rust 16 | let mut x = Box::new(0); // let makes a fresh variable, so never need to drop 17 | let y = &mut x; 18 | *y = Box::new(1); // Deref assumes the referent is initialized, so always drops 19 | ``` 20 | 21 | This is only a problem when overwriting a previously initialized variable or 22 | one of its subfields. 23 | 24 | It turns out that Rust actually tracks whether a type should be dropped or not 25 | *at runtime*. As a variable becomes initialized and uninitialized, a *drop flag* 26 | for that variable is toggled. When a variable might need to be dropped, this 27 | flag is evaluated to determine if it should be dropped. 28 | 29 | Of course, it is often the case that a value's initialization state can be 30 | statically known at every point in the program. If this is the case, then the 31 | compiler can theoretically generate more efficient code! For instance, straight- 32 | line code has such *static drop semantics*: 33 | 34 | ```rust 35 | let mut x = Box::new(0); // x was uninit; just overwrite. 36 | let mut y = x; // y was uninit; just overwrite and make x uninit. 37 | x = Box::new(0); // x was uninit; just overwrite. 38 | y = x; // y was init; Drop y, overwrite it, and make x uninit! 39 | // y goes out of scope; y was init; Drop y! 40 | // x goes out of scope; x was uninit; do nothing. 41 | ``` 42 | 43 | Similarly, branched code where all branches have the same behavior with respect 44 | to initialization has static drop semantics: 45 | 46 | ```rust 47 | # let condition = true; 48 | let mut x = Box::new(0); // x was uninit; just overwrite. 49 | if condition { 50 | drop(x) // x gets moved out; make x uninit. 51 | } else { 52 | println!("{}", x); 53 | drop(x) // x gets moved out; make x uninit. 54 | } 55 | x = Box::new(0); // x was uninit; just overwrite. 56 | // x goes out of scope; x was init; Drop x! 57 | ``` 58 | 59 | However code like this *requires* runtime information to correctly Drop: 60 | 61 | ```rust 62 | # let condition = true; 63 | let x; 64 | if condition { 65 | x = Box::new(0); // x was uninit; just overwrite. 66 | println!("{}", x); 67 | } 68 | // x goes out of scope; x might be uninit; 69 | // check the flag! 70 | ``` 71 | 72 | Of course, in this case it's trivial to retrieve static drop semantics: 73 | 74 | ```rust 75 | # let condition = true; 76 | if condition { 77 | let x = Box::new(0); 78 | println!("{}", x); 79 | } 80 | ``` 81 | 82 | The drop flags are tracked on the stack. 83 | In old Rust versions, drop flags were stashed in a hidden field of types that implement `Drop`. 84 | -------------------------------------------------------------------------------- /src/meet-safe-and-unsafe.md: -------------------------------------------------------------------------------- 1 | # Meet Safe and Unsafe 2 | 3 | ![safe and unsafe](img/safeandunsafe.svg) 4 | 5 | It would be great to not have to worry about low-level implementation details. 6 | Who could possibly care how much space the empty tuple occupies? 
Sadly, it 7 | sometimes matters and we need to worry about it. The most common reason 8 | developers start to care about implementation details is performance, but more 9 | importantly, these details can become a matter of correctness when interfacing 10 | directly with hardware, operating systems, or other languages. 11 | 12 | When implementation details start to matter in a safe programming language, 13 | programmers usually have three options: 14 | 15 | * fiddle with the code to encourage the compiler/runtime to perform an optimization 16 | * adopt a more unidiomatic or cumbersome design to get the desired implementation 17 | * rewrite the implementation in a language that lets you deal with those details 18 | 19 | For that last option, the language programmers tend to use is *C*. This is often 20 | necessary to interface with systems that only declare a C interface. 21 | 22 | Unfortunately, C is incredibly unsafe to use (sometimes for good reason), 23 | and this unsafety is magnified when trying to interoperate with another 24 | language. Care must be taken to ensure C and the other language agree on 25 | what's happening, and that they don't step on each other's toes. 26 | 27 | So what does this have to do with Rust? 28 | 29 | Well, unlike C, Rust is a safe programming language. 30 | 31 | But, like C, Rust is an unsafe programming language. 32 | 33 | More accurately, Rust *contains* both a safe and unsafe programming language. 34 | 35 | Rust can be thought of as a combination of two programming languages: *Safe 36 | Rust* and *Unsafe Rust*. Conveniently, these names mean exactly what they say: 37 | Safe Rust is Safe. Unsafe Rust is, well, not. In fact, Unsafe Rust lets us 38 | do some *really* unsafe things. Things the Rust authors will implore you not to 39 | do, but we'll do anyway. 40 | 41 | Safe Rust is the *true* Rust programming language. If all you do is write Safe 42 | Rust, you will never have to worry about type-safety or memory-safety. You will 43 | never endure a dangling pointer, a use-after-free, or any other kind of 44 | Undefined Behavior (a.k.a. UB). 45 | 46 | The standard library also gives you enough utilities out of the box that you'll 47 | be able to write high-performance applications and libraries in pure idiomatic 48 | Safe Rust. 49 | 50 | But maybe you want to talk to another language. Maybe you're writing a 51 | low-level abstraction not exposed by the standard library. Maybe you're 52 | *writing* the standard library (which is written entirely in Rust). Maybe you 53 | need to do something the type-system doesn't understand and just *frob some dang 54 | bits*. Maybe you need Unsafe Rust. 55 | 56 | Unsafe Rust is exactly like Safe Rust with all the same rules and semantics. 57 | It just lets you do some *extra* things that are Definitely Not Safe 58 | (which we will define in the next section). 59 | 60 | The value of this separation is that we gain the benefits of using an unsafe 61 | language like C — low level control over implementation details — without most 62 | of the problems that come with trying to integrate it with a completely 63 | different safe language. 64 | 65 | There are still some problems — most notably, we must become aware of properties 66 | that the type system assumes and audit them in any code that interacts with 67 | Unsafe Rust. That's the purpose of this book: to teach you about these assumptions 68 | and how to manage them. 
69 | -------------------------------------------------------------------------------- /src/arc-mutex/arc-clone.md: -------------------------------------------------------------------------------- 1 | # Cloning 2 | 3 | Now that we've got some basic code set up, we'll need a way to clone the `Arc`. 4 | 5 | Basically, we need to: 6 | 7 | 1. Increment the atomic reference count 8 | 2. Construct a new instance of the `Arc` from the inner pointer 9 | 10 | First, we need to get access to the `ArcInner`: 11 | 12 | 13 | ```rust,ignore 14 | let inner = unsafe { self.ptr.as_ref() }; 15 | ``` 16 | 17 | We can update the atomic reference count as follows: 18 | 19 | 20 | ```rust,ignore 21 | let old_rc = inner.rc.fetch_add(1, Ordering::???); 22 | ``` 23 | 24 | But what ordering should we use here? We don't really have any code that will 25 | need atomic synchronization when cloning, as we do not modify the internal value 26 | while cloning. Thus, we can use a Relaxed ordering here, which implies no 27 | happens-before relationship but is atomic. When `Drop`ping the Arc, however, 28 | we'll need to atomically synchronize when decrementing the reference count. This 29 | is described more in [the section on the `Drop` implementation for 30 | `Arc`](arc-drop.md). For more information on atomic relationships and Relaxed 31 | ordering, see [the section on atomics](../atomics.md). 32 | 33 | Thus, the code becomes this: 34 | 35 | 36 | ```rust,ignore 37 | let old_rc = inner.rc.fetch_add(1, Ordering::Relaxed); 38 | ``` 39 | 40 | We'll need to add another import to use `Ordering`: 41 | 42 | ```rust 43 | use std::sync::atomic::Ordering; 44 | ``` 45 | 46 | However, we have one problem with this implementation right now. What if someone 47 | decides to `mem::forget` a bunch of Arcs? The code we have written so far (and 48 | will write) assumes that the reference count accurately portrays how many Arcs 49 | are in memory, but with `mem::forget` this is false. Thus, when more and more 50 | Arcs are cloned from this one without them being `Drop`ped and the reference 51 | count being decremented, we can overflow! This will cause use-after-free which 52 | is **INCREDIBLY BAD!** 53 | 54 | To handle this, we need to check that the reference count does not go over some 55 | arbitrary value (below `usize::MAX`, as we're storing the reference count as an 56 | `AtomicUsize`), and do *something*. 57 | 58 | The standard library's implementation decides to just abort the program (as it 59 | is an incredibly unlikely case in normal code and if it happens, the program is 60 | probably incredibly degenerate) if the reference count reaches `isize::MAX` 61 | (about half of `usize::MAX`) on any thread, on the assumption that there are 62 | probably not about 2 billion threads (or about **9 quintillion** on some 64-bit 63 | machines) incrementing the reference count at once. This is what we'll do. 
64 | 65 | It's pretty simple to implement this behavior: 66 | 67 | 68 | ```rust,ignore 69 | if old_rc >= isize::MAX as usize { 70 | std::process::abort(); 71 | } 72 | ``` 73 | 74 | Then, we need to return a new instance of the `Arc`: 75 | 76 | 77 | ```rust,ignore 78 | Self { 79 | ptr: self.ptr, 80 | phantom: PhantomData 81 | } 82 | ``` 83 | 84 | Now, let's wrap this all up inside the `Clone` implementation: 85 | 86 | 87 | ```rust,ignore 88 | use std::sync::atomic::Ordering; 89 | 90 | impl Clone for Arc { 91 | fn clone(&self) -> Arc { 92 | let inner = unsafe { self.ptr.as_ref() }; 93 | // Using a relaxed ordering is alright here as we don't need any atomic 94 | // synchronization here as we're not modifying or accessing the inner 95 | // data. 96 | let old_rc = inner.rc.fetch_add(1, Ordering::Relaxed); 97 | 98 | if old_rc >= isize::MAX as usize { 99 | std::process::abort(); 100 | } 101 | 102 | Self { 103 | ptr: self.ptr, 104 | phantom: PhantomData, 105 | } 106 | } 107 | } 108 | ``` 109 | -------------------------------------------------------------------------------- /src/lifetime-mismatch.md: -------------------------------------------------------------------------------- 1 | # Limits of Lifetimes 2 | 3 | Given the following code: 4 | 5 | ```rust,compile_fail 6 | #[derive(Debug)] 7 | struct Foo; 8 | 9 | impl Foo { 10 | fn mutate_and_share(&mut self) -> &Self { &*self } 11 | fn share(&self) {} 12 | } 13 | 14 | fn main() { 15 | let mut foo = Foo; 16 | let loan = foo.mutate_and_share(); 17 | foo.share(); 18 | println!("{:?}", loan); 19 | } 20 | ``` 21 | 22 | One might expect it to compile. We call `mutate_and_share`, which mutably 23 | borrows `foo` temporarily, but then returns only a shared reference. Therefore 24 | we would expect `foo.share()` to succeed as `foo` shouldn't be mutably borrowed. 25 | 26 | However when we try to compile it: 27 | 28 | ```text 29 | error[E0502]: cannot borrow `foo` as immutable because it is also borrowed as mutable 30 | --> src/main.rs:12:5 31 | | 32 | 11 | let loan = foo.mutate_and_share(); 33 | | --- mutable borrow occurs here 34 | 12 | foo.share(); 35 | | ^^^ immutable borrow occurs here 36 | 13 | println!("{:?}", loan); 37 | ``` 38 | 39 | What happened? Well, we got the exact same reasoning as we did for 40 | [Example 2 in the previous section][ex2]. We desugar the program and we get 41 | the following: 42 | 43 | 44 | ```rust,ignore 45 | struct Foo; 46 | 47 | impl Foo { 48 | fn mutate_and_share<'a>(&'a mut self) -> &'a Self { &'a *self } 49 | fn share<'a>(&'a self) {} 50 | } 51 | 52 | fn main() { 53 | 'b: { 54 | let mut foo: Foo = Foo; 55 | 'c: { 56 | let loan: &'c Foo = Foo::mutate_and_share::<'c>(&'c mut foo); 57 | 'd: { 58 | Foo::share::<'d>(&'d foo); 59 | } 60 | println!("{:?}", loan); 61 | } 62 | } 63 | } 64 | ``` 65 | 66 | The lifetime system is forced to extend the `&mut foo` to have lifetime `'c`, 67 | due to the lifetime of `loan` and `mutate_and_share`'s signature. Then when we 68 | try to call `share`, it sees we're trying to alias that `&'c mut foo` and 69 | blows up in our face! 70 | 71 | This program is clearly correct according to the reference semantics we actually 72 | care about, but the lifetime system is too coarse-grained to handle that. 73 | 74 | ## Improperly reduced borrows 75 | 76 | The following code fails to compile, because Rust sees that a variable, `map`, 77 | is borrowed twice, and can not infer that the first borrow ceases to be needed 78 | before the second one occurs. 
This is caused by Rust conservatively falling back 79 | to using a whole scope for the first borrow. This will eventually get fixed. 80 | 81 | ```rust,compile_fail 82 | # use std::collections::HashMap; 83 | # use std::hash::Hash; 84 | fn get_default<'m, K, V>(map: &'m mut HashMap, key: K) -> &'m mut V 85 | where 86 | K: Clone + Eq + Hash, 87 | V: Default, 88 | { 89 | match map.get_mut(&key) { 90 | Some(value) => value, 91 | None => { 92 | map.insert(key.clone(), V::default()); 93 | map.get_mut(&key).unwrap() 94 | } 95 | } 96 | } 97 | ``` 98 | 99 | Because of the lifetime restrictions imposed, `&mut map`'s lifetime 100 | overlaps other mutable borrows, resulting in a compile error: 101 | 102 | ```text 103 | error[E0499]: cannot borrow `*map` as mutable more than once at a time 104 | --> src/main.rs:12:13 105 | | 106 | 4 | fn get_default<'m, K, V>(map: &'m mut HashMap, key: K) -> &'m mut V 107 | | -- lifetime `'m` defined here 108 | ... 109 | 9 | match map.get_mut(&key) { 110 | | - --- first mutable borrow occurs here 111 | | _____| 112 | | | 113 | 10 | | Some(value) => value, 114 | 11 | | None => { 115 | 12 | | map.insert(key.clone(), V::default()); 116 | | | ^^^ second mutable borrow occurs here 117 | 13 | | map.get_mut(&key).unwrap() 118 | 14 | | } 119 | 15 | | } 120 | | |_____- returning this value requires that `*map` is borrowed for `'m` 121 | ``` 122 | 123 | [ex2]: lifetimes.html#example-aliasing-a-mutable-reference 124 | -------------------------------------------------------------------------------- /src/races.md: -------------------------------------------------------------------------------- 1 | # Data Races and Race Conditions 2 | 3 | Safe Rust guarantees an absence of data races, which are defined as: 4 | 5 | * two or more threads concurrently accessing a location of memory 6 | * one or more of them is a write 7 | * one or more of them is unsynchronized 8 | 9 | A data race has Undefined Behavior, and is therefore impossible to perform in 10 | Safe Rust. Data races are prevented *mostly* through Rust's ownership system alone: 11 | it's impossible to alias a mutable reference, so it's impossible to perform a 12 | data race. Interior mutability makes this more complicated, which is largely why 13 | we have the Send and Sync traits (see the next section for more on this). 14 | 15 | **However Rust does not prevent general race conditions.** 16 | 17 | This is mathematically impossible in situations where you do not control the 18 | scheduler, which is true for the normal OS environment. If you do control 19 | preemption, it _can be_ possible to prevent general races - this technique is 20 | used by frameworks such as [RTIC](https://github.com/rtic-rs/rtic). However, 21 | actually having control over scheduling is a very uncommon case. 22 | 23 | For this reason, it is considered "safe" for Rust to get deadlocked or do 24 | something nonsensical with incorrect synchronization: this is known as a general 25 | race condition or resource race. Obviously such a program isn't very good, but 26 | Rust of course cannot prevent all logic errors. 27 | 28 | In any case, a race condition cannot violate memory safety in a Rust program on 29 | its own. Only in conjunction with some other unsafe code can a race condition 30 | actually violate memory safety. 
For instance, a correct program looks like this: 31 | 32 | ```rust,no_run 33 | use std::thread; 34 | use std::sync::atomic::{AtomicUsize, Ordering}; 35 | use std::sync::Arc; 36 | 37 | let data = vec![1, 2, 3, 4]; 38 | // Arc so that the memory the AtomicUsize is stored in still exists for 39 | // the other thread to increment, even if we completely finish executing 40 | // before it. Rust won't compile the program without it, because of the 41 | // lifetime requirements of thread::spawn! 42 | let idx = Arc::new(AtomicUsize::new(0)); 43 | let other_idx = idx.clone(); 44 | 45 | // `move` captures other_idx by-value, moving it into this thread 46 | thread::spawn(move || { 47 | // It's ok to mutate idx because this value 48 | // is an atomic, so it can't cause a Data Race. 49 | other_idx.fetch_add(10, Ordering::SeqCst); 50 | }); 51 | 52 | // Index with the value loaded from the atomic. This is safe because we 53 | // read the atomic memory only once, and then pass a copy of that value 54 | // to the Vec's indexing implementation. This indexing will be correctly 55 | // bounds checked, and there's no chance of the value getting changed 56 | // in the middle. However our program may panic if the thread we spawned 57 | // managed to increment before this ran. A race condition because correct 58 | // program execution (panicking is rarely correct) depends on order of 59 | // thread execution. 60 | println!("{}", data[idx.load(Ordering::SeqCst)]); 61 | ``` 62 | 63 | We can cause a race condition to violate memory safety if we instead do the bound 64 | check in advance, and then unsafely access the data with an unchecked value: 65 | 66 | ```rust,no_run 67 | use std::thread; 68 | use std::sync::atomic::{AtomicUsize, Ordering}; 69 | use std::sync::Arc; 70 | 71 | let data = vec![1, 2, 3, 4]; 72 | 73 | let idx = Arc::new(AtomicUsize::new(0)); 74 | let other_idx = idx.clone(); 75 | 76 | // `move` captures other_idx by-value, moving it into this thread 77 | thread::spawn(move || { 78 | // It's ok to mutate idx because this value 79 | // is an atomic, so it can't cause a Data Race. 80 | other_idx.fetch_add(10, Ordering::SeqCst); 81 | }); 82 | 83 | if idx.load(Ordering::SeqCst) < data.len() { 84 | unsafe { 85 | // Incorrectly loading the idx after we did the bounds check. 86 | // It could have changed. This is a race condition, *and dangerous* 87 | // because we decided to do `get_unchecked`, which is `unsafe`. 88 | println!("{}", data.get_unchecked(idx.load(Ordering::SeqCst))); 89 | } 90 | } 91 | ``` 92 | -------------------------------------------------------------------------------- /src/arc-mutex/arc-drop.md: -------------------------------------------------------------------------------- 1 | # Dropping 2 | 3 | We now need a way to decrease the reference count and drop the data once it is 4 | low enough, otherwise the data will live forever on the heap. 5 | 6 | To do this, we can implement `Drop`. 7 | 8 | Basically, we need to: 9 | 10 | 1. Decrement the reference count 11 | 2. If there is only one reference remaining to the data, then: 12 | 3. Atomically fence the data to prevent reordering of the use and deletion of 13 | the data 14 | 4. Drop the inner data 15 | 16 | First, we'll need to get access to the `ArcInner`: 17 | 18 | 19 | ```rust,ignore 20 | let inner = unsafe { self.ptr.as_ref() }; 21 | ``` 22 | 23 | Now, we need to decrement the reference count. 
To streamline our code, we can 24 | also return if the returned value from `fetch_sub` (the value of the reference 25 | count before decrementing it) is not equal to `1` (which happens when we are not 26 | the last reference to the data). 27 | 28 | 29 | ```rust,ignore 30 | if inner.rc.fetch_sub(1, Ordering::Release) != 1 { 31 | return; 32 | } 33 | ``` 34 | 35 | We then need to create an atomic fence to prevent reordering of the use of the 36 | data and deletion of the data. As described in [the standard library's 37 | implementation of `Arc`][3]: 38 | > This fence is needed to prevent reordering of use of the data and deletion of 39 | > the data. Because it is marked `Release`, the decreasing of the reference 40 | > count synchronizes with this `Acquire` fence. This means that use of the data 41 | > happens before decreasing the reference count, which happens before this 42 | > fence, which happens before the deletion of the data. 43 | > 44 | > As explained in the [Boost documentation][1], 45 | > 46 | > > It is important to enforce any possible access to the object in one 47 | > > thread (through an existing reference) to *happen before* deleting 48 | > > the object in a different thread. This is achieved by a "release" 49 | > > operation after dropping a reference (any access to the object 50 | > > through this reference must obviously happened before), and an 51 | > > "acquire" operation before deleting the object. 52 | > 53 | > In particular, while the contents of an Arc are usually immutable, it's 54 | > possible to have interior writes to something like a `Mutex`. Since a Mutex 55 | > is not acquired when it is deleted, we can't rely on its synchronization logic 56 | > to make writes in thread A visible to a destructor running in thread B. 57 | > 58 | > Also note that the Acquire fence here could probably be replaced with an 59 | > Acquire load, which could improve performance in highly-contended situations. 60 | > See [2]. 61 | > 62 | > [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html 63 | > [2]: https://github.com/rust-lang/rust/pull/41714 64 | [3]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1440-L1467 65 | 66 | To do this, we do the following: 67 | 68 | ```rust 69 | # use std::sync::atomic::Ordering; 70 | use std::sync::atomic; 71 | atomic::fence(Ordering::Acquire); 72 | ``` 73 | 74 | Finally, we can drop the data itself. We use `Box::from_raw` to drop the boxed 75 | `ArcInner` and its data. This takes a `*mut T` and not a `NonNull`, so we 76 | must convert using `NonNull::as_ptr`. 77 | 78 | 79 | ```rust,ignore 80 | unsafe { Box::from_raw(self.ptr.as_ptr()); } 81 | ``` 82 | 83 | This is safe as we know we have the last pointer to the `ArcInner` and that its 84 | pointer is valid. 85 | 86 | Now, let's wrap this all up inside the `Drop` implementation: 87 | 88 | 89 | ```rust,ignore 90 | impl Drop for Arc { 91 | fn drop(&mut self) { 92 | let inner = unsafe { self.ptr.as_ref() }; 93 | if inner.rc.fetch_sub(1, Ordering::Release) != 1 { 94 | return; 95 | } 96 | // This fence is needed to prevent reordering of the use and deletion 97 | // of the data. 98 | atomic::fence(Ordering::Acquire); 99 | // This is safe as we know we have the last pointer to the `ArcInner` 100 | // and that its pointer is valid. 
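// Reconstructing the Box hands ownership of the allocation back to it,
// so dropping it here runs the destructor of the inner data and frees
// the heap memory.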
101 | unsafe { Box::from_raw(self.ptr.as_ptr()); } 102 | } 103 | } 104 | ``` 105 | -------------------------------------------------------------------------------- /src/working-with-unsafe.md: -------------------------------------------------------------------------------- 1 | # Working with Unsafe 2 | 3 | Rust generally only gives us the tools to talk about Unsafe Rust in a scoped and 4 | binary manner. Unfortunately, reality is significantly more complicated than 5 | that. For instance, consider the following toy function: 6 | 7 | ```rust 8 | fn index(idx: usize, arr: &[u8]) -> Option { 9 | if idx < arr.len() { 10 | unsafe { 11 | Some(*arr.get_unchecked(idx)) 12 | } 13 | } else { 14 | None 15 | } 16 | } 17 | ``` 18 | 19 | This function is safe and correct. We check that the index is in bounds, and if 20 | it is, index into the array in an unchecked manner. We say that such a correct 21 | unsafely implemented function is *sound*, meaning that safe code cannot cause 22 | Undefined Behavior through it (which, remember, is the single fundamental 23 | property of Safe Rust). 24 | 25 | But even in such a trivial function, the scope of the unsafe block is 26 | questionable. Consider changing the `<` to a `<=`: 27 | 28 | ```rust 29 | fn index(idx: usize, arr: &[u8]) -> Option { 30 | if idx <= arr.len() { 31 | unsafe { 32 | Some(*arr.get_unchecked(idx)) 33 | } 34 | } else { 35 | None 36 | } 37 | } 38 | ``` 39 | 40 | This program is now *unsound*, Safe Rust can cause Undefined Behavior, and yet 41 | *we only modified safe code*. This is the fundamental problem of safety: it's 42 | non-local. The soundness of our unsafe operations necessarily depends on the 43 | state established by otherwise "safe" operations. 44 | 45 | Safety is modular in the sense that opting into unsafety doesn't require you 46 | to consider arbitrary other kinds of badness. For instance, doing an unchecked 47 | index into a slice doesn't mean you suddenly need to worry about the slice being 48 | null or containing uninitialized memory. Nothing fundamentally changes. However 49 | safety *isn't* modular in the sense that programs are inherently stateful and 50 | your unsafe operations may depend on arbitrary other state. 51 | 52 | This non-locality gets much worse when we incorporate actual persistent state. 53 | Consider a simple implementation of `Vec`: 54 | 55 | ```rust 56 | use std::ptr; 57 | 58 | // Note: This definition is naive. See the chapter on implementing Vec. 59 | pub struct Vec { 60 | ptr: *mut T, 61 | len: usize, 62 | cap: usize, 63 | } 64 | 65 | // Note this implementation does not correctly handle zero-sized types. 66 | // See the chapter on implementing Vec. 67 | impl Vec { 68 | pub fn push(&mut self, elem: T) { 69 | if self.len == self.cap { 70 | // not important for this example 71 | self.reallocate(); 72 | } 73 | unsafe { 74 | ptr::write(self.ptr.add(self.len), elem); 75 | self.len += 1; 76 | } 77 | } 78 | # fn reallocate(&mut self) { } 79 | } 80 | 81 | # fn main() {} 82 | ``` 83 | 84 | This code is simple enough to reasonably audit and informally verify. Now consider 85 | adding the following method: 86 | 87 | 88 | ```rust,ignore 89 | fn make_room(&mut self) { 90 | // grow the capacity 91 | self.cap += 1; 92 | } 93 | ``` 94 | 95 | This code is 100% Safe Rust but it is also completely unsound. Changing the 96 | capacity violates the invariants of Vec (that `cap` reflects the allocated space 97 | in the Vec). This is not something the rest of Vec can guard against. 
It *has* 98 | to trust the capacity field because there's no way to verify it. 99 | 100 | Because it relies on invariants of a struct field, this `unsafe` code 101 | does more than pollute a whole function: it pollutes a whole *module*. 102 | Generally, the only bullet-proof way to limit the scope of unsafe code is at the 103 | module boundary with privacy. 104 | 105 | However this works *perfectly*. The existence of `make_room` is *not* a 106 | problem for the soundness of Vec because we didn't mark it as public. Only the 107 | module that defines this function can call it. Also, `make_room` directly 108 | accesses the private fields of Vec, so it can only be written in the same module 109 | as Vec. 110 | 111 | It is therefore possible for us to write a completely safe abstraction that 112 | relies on complex invariants. This is *critical* to the relationship between 113 | Safe Rust and Unsafe Rust. 114 | 115 | We have already seen that Unsafe code must trust *some* Safe code, but shouldn't 116 | trust *generic* Safe code. Privacy is important to unsafe code for similar reasons: 117 | it prevents us from having to trust all the safe code in the universe from messing 118 | with our trusted state. 119 | 120 | Safety lives! 121 | -------------------------------------------------------------------------------- /src/arc-mutex/arc-base.md: -------------------------------------------------------------------------------- 1 | # Base Code 2 | 3 | Now that we've decided the layout for our implementation of `Arc`, let's create 4 | some basic code. 5 | 6 | ## Constructing the Arc 7 | 8 | We'll first need a way to construct an `Arc`. 9 | 10 | This is pretty simple, as we just need to box the `ArcInner` and get a 11 | `NonNull` pointer to it. 12 | 13 | 14 | ```rust,ignore 15 | impl Arc { 16 | pub fn new(data: T) -> Arc { 17 | // We start the reference count at 1, as that first reference is the 18 | // current pointer. 19 | let boxed = Box::new(ArcInner { 20 | rc: AtomicUsize::new(1), 21 | data, 22 | }); 23 | Arc { 24 | // It is okay to call `.unwrap()` here as we get a pointer from 25 | // `Box::into_raw` which is guaranteed to not be null. 26 | ptr: NonNull::new(Box::into_raw(boxed)).unwrap(), 27 | phantom: PhantomData, 28 | } 29 | } 30 | } 31 | ``` 32 | 33 | ## Send and Sync 34 | 35 | Since we're building a concurrency primitive, we'll need to be able to send it 36 | across threads. Thus, we can implement the `Send` and `Sync` marker traits. For 37 | more information on these, see [the section on `Send` and 38 | `Sync`](../send-and-sync.md). 39 | 40 | This is okay because: 41 | * You can only get a mutable reference to the value inside an `Arc` if and only 42 | if it is the only `Arc` referencing that data (which only happens in `Drop`) 43 | * We use atomics for the shared mutable reference counting 44 | 45 | 46 | ```rust,ignore 47 | unsafe impl Send for Arc {} 48 | unsafe impl Sync for Arc {} 49 | ``` 50 | 51 | We need to have the bound `T: Sync + Send` because if we did not provide those 52 | bounds, it would be possible to share values that are thread-unsafe across a 53 | thread boundary via an `Arc`, which could possibly cause data races or 54 | unsoundness. 55 | 56 | For example, if those bounds were not present, `Arc>` would be `Sync` or 57 | `Send`, meaning that you could clone the `Rc` out of the `Arc` to send it across 58 | a thread (without creating an entirely new `Rc`), which would create data races 59 | as `Rc` is not thread-safe. 
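To see the compiler enforcing this, here is a quick sketch using the *standard library's* `Arc` (purely for illustration, not part of our implementation): sending an `Arc` containing an `Rc` across a thread is rejected, because `Rc` is neither `Send` nor `Sync`:

```rust,compile_fail
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

fn main() {
    let shared = Arc::new(Rc::new(0));
    // ERROR: `Rc<i32>` cannot be sent between threads safely, so the
    // closure capturing `shared` is not `Send`.
    thread::spawn(move || {
        let _inner = Rc::clone(&shared);
    });
}
```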
60 | 61 | ## Getting the `ArcInner` 62 | 63 | To dereference the `NonNull` pointer into a `&T`, we can call 64 | `NonNull::as_ref`. This is unsafe, unlike the typical `as_ref` function, so we 65 | must call it like this: 66 | 67 | 68 | ```rust,ignore 69 | unsafe { self.ptr.as_ref() } 70 | ``` 71 | 72 | We'll be using this snippet a few times in this code (usually with an associated 73 | `let` binding). 74 | 75 | This unsafety is okay because while this `Arc` is alive, we're guaranteed that 76 | the inner pointer is valid. 77 | 78 | ## Deref 79 | 80 | Alright. Now we can make `Arc`s (and soon will be able to clone and destroy them correctly), but how do we get 81 | to the data inside? 82 | 83 | What we need now is an implementation of `Deref`. 84 | 85 | We'll need to import the trait: 86 | 87 | 88 | ```rust,ignore 89 | use std::ops::Deref; 90 | ``` 91 | 92 | And here's the implementation: 93 | 94 | 95 | ```rust,ignore 96 | impl Deref for Arc { 97 | type Target = T; 98 | 99 | fn deref(&self) -> &T { 100 | let inner = unsafe { self.ptr.as_ref() }; 101 | &inner.data 102 | } 103 | } 104 | ``` 105 | 106 | Pretty simple, eh? This simply dereferences the `NonNull` pointer to the 107 | `ArcInner`, then gets a reference to the data inside. 108 | 109 | ## Code 110 | 111 | Here's all the code from this section: 112 | 113 | 114 | ```rust,ignore 115 | use std::ops::Deref; 116 | 117 | impl Arc { 118 | pub fn new(data: T) -> Arc { 119 | // We start the reference count at 1, as that first reference is the 120 | // current pointer. 121 | let boxed = Box::new(ArcInner { 122 | rc: AtomicUsize::new(1), 123 | data, 124 | }); 125 | Arc { 126 | // It is okay to call `.unwrap()` here as we get a pointer from 127 | // `Box::into_raw` which is guaranteed to not be null. 128 | ptr: NonNull::new(Box::into_raw(boxed)).unwrap(), 129 | phantom: PhantomData, 130 | } 131 | } 132 | } 133 | 134 | unsafe impl Send for Arc {} 135 | unsafe impl Sync for Arc {} 136 | 137 | 138 | impl Deref for Arc { 139 | type Target = T; 140 | 141 | fn deref(&self) -> &T { 142 | let inner = unsafe { self.ptr.as_ref() }; 143 | &inner.data 144 | } 145 | } 146 | ``` 147 | -------------------------------------------------------------------------------- /src/vec/vec-raw.md: -------------------------------------------------------------------------------- 1 | # RawVec 2 | 3 | We've actually reached an interesting situation here: we've duplicated the logic 4 | for specifying a buffer and freeing its memory in Vec and IntoIter. Now that 5 | we've implemented it and identified *actual* logic duplication, this is a good 6 | time to perform some logic compression. 7 | 8 | We're going to abstract out the `(ptr, cap)` pair and give them the logic for 9 | allocating, growing, and freeing: 10 | 11 | 12 | ```rust,ignore 13 | struct RawVec { 14 | ptr: NonNull, 15 | cap: usize, 16 | } 17 | 18 | unsafe impl Send for RawVec {} 19 | unsafe impl Sync for RawVec {} 20 | 21 | impl RawVec { 22 | fn new() -> Self { 23 | assert!(mem::size_of::() != 0, "TODO: implement ZST support"); 24 | RawVec { 25 | ptr: NonNull::dangling(), 26 | cap: 0, 27 | } 28 | } 29 | 30 | fn grow(&mut self) { 31 | // This can't overflow because we ensure self.cap <= isize::MAX. 32 | let new_cap = if self.cap == 0 { 1 } else { 2 * self.cap }; 33 | 34 | // Layout::array checks that the number of bytes is <= usize::MAX, 35 | // but this is redundant since old_layout.size() <= isize::MAX, 36 | // so the `unwrap` should never fail. 
37 | let new_layout = Layout::array::(new_cap).unwrap(); 38 | 39 | // Ensure that the new allocation doesn't exceed `isize::MAX` bytes. 40 | assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large"); 41 | 42 | let new_ptr = if self.cap == 0 { 43 | unsafe { alloc::alloc(new_layout) } 44 | } else { 45 | let old_layout = Layout::array::(self.cap).unwrap(); 46 | let old_ptr = self.ptr.as_ptr() as *mut u8; 47 | unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) } 48 | }; 49 | 50 | // If allocation fails, `new_ptr` will be null, in which case we abort. 51 | self.ptr = match NonNull::new(new_ptr as *mut T) { 52 | Some(p) => p, 53 | None => alloc::handle_alloc_error(new_layout), 54 | }; 55 | self.cap = new_cap; 56 | } 57 | } 58 | 59 | impl Drop for RawVec { 60 | fn drop(&mut self) { 61 | if self.cap != 0 { 62 | let layout = Layout::array::(self.cap).unwrap(); 63 | unsafe { 64 | alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout); 65 | } 66 | } 67 | } 68 | } 69 | ``` 70 | 71 | And change Vec as follows: 72 | 73 | 74 | ```rust,ignore 75 | pub struct Vec { 76 | buf: RawVec, 77 | len: usize, 78 | } 79 | 80 | impl Vec { 81 | fn ptr(&self) -> *mut T { 82 | self.buf.ptr.as_ptr() 83 | } 84 | 85 | fn cap(&self) -> usize { 86 | self.buf.cap 87 | } 88 | 89 | pub fn new() -> Self { 90 | Vec { 91 | buf: RawVec::new(), 92 | len: 0, 93 | } 94 | } 95 | 96 | // push/pop/insert/remove largely unchanged: 97 | // * `self.ptr.as_ptr() -> self.ptr()` 98 | // * `self.cap -> self.cap()` 99 | // * `self.grow() -> self.buf.grow()` 100 | } 101 | 102 | impl Drop for Vec { 103 | fn drop(&mut self) { 104 | while let Some(_) = self.pop() {} 105 | // deallocation is handled by RawVec 106 | } 107 | } 108 | ``` 109 | 110 | And finally we can really simplify IntoIter: 111 | 112 | 113 | ```rust,ignore 114 | pub struct IntoIter { 115 | _buf: RawVec, // we don't actually care about this. Just need it to live. 116 | start: *const T, 117 | end: *const T, 118 | } 119 | 120 | // next and next_back literally unchanged since they never referred to the buf 121 | 122 | impl Drop for IntoIter { 123 | fn drop(&mut self) { 124 | // only need to ensure all our elements are read; 125 | // buffer will clean itself up afterwards. 126 | for _ in &mut *self {} 127 | } 128 | } 129 | 130 | impl IntoIterator for Vec { 131 | type Item = T; 132 | type IntoIter = IntoIter; 133 | fn into_iter(self) -> IntoIter { 134 | // need to use ptr::read to unsafely move the buf out since it's 135 | // not Copy, and Vec implements Drop (so we can't destructure it). 136 | let buf = unsafe { ptr::read(&self.buf) }; 137 | let len = self.len; 138 | mem::forget(self); 139 | 140 | IntoIter { 141 | start: buf.ptr.as_ptr(), 142 | end: if buf.cap == 0 { 143 | // can't offset off of a pointer unless it's part of an allocation 144 | buf.ptr.as_ptr() 145 | } else { 146 | unsafe { buf.ptr.as_ptr().add(len) } 147 | }, 148 | _buf: buf, 149 | } 150 | } 151 | } 152 | ``` 153 | 154 | Much better. 155 | -------------------------------------------------------------------------------- /src/vec/vec-drain.md: -------------------------------------------------------------------------------- 1 | # Drain 2 | 3 | Let's move on to Drain. Drain is largely the same as IntoIter, except that 4 | instead of consuming the Vec, it borrows the Vec and leaves its allocation 5 | untouched. For now we'll only implement the "basic" full-range version. 
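As a reminder of the behavior we are aiming for, here is a usage sketch with the *standard library's* `Vec` (our version exposes the same idea, just without the range argument):

```rust
fn main() {
    let mut v = vec![1, 2, 3];

    // Drain yields the elements by value while only borrowing the Vec...
    let drained: Vec<i32> = v.drain(..).collect();
    assert_eq!(drained, [1, 2, 3]);

    // ...and the Vec keeps its allocation afterwards.
    assert!(v.is_empty());
    assert!(v.capacity() >= 3);
}
```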
6 | 7 | 8 | ```rust,ignore 9 | use std::marker::PhantomData; 10 | 11 | struct Drain<'a, T: 'a> { 12 | // Need to bound the lifetime here, so we do it with `&'a mut Vec` 13 | // because that's semantically what we contain. We're "just" calling 14 | // `pop()` and `remove(0)`. 15 | vec: PhantomData<&'a mut Vec>, 16 | start: *const T, 17 | end: *const T, 18 | } 19 | 20 | impl<'a, T> Iterator for Drain<'a, T> { 21 | type Item = T; 22 | fn next(&mut self) -> Option { 23 | if self.start == self.end { 24 | None 25 | ``` 26 | 27 | -- wait, this is seeming familiar. Let's do some more compression. Both 28 | IntoIter and Drain have the exact same structure, let's just factor it out. 29 | 30 | 31 | ```rust,ignore 32 | struct RawValIter { 33 | start: *const T, 34 | end: *const T, 35 | } 36 | 37 | impl RawValIter { 38 | // unsafe to construct because it has no associated lifetimes. 39 | // This is necessary to store a RawValIter in the same struct as 40 | // its actual allocation. OK since it's a private implementation 41 | // detail. 42 | unsafe fn new(slice: &[T]) -> Self { 43 | RawValIter { 44 | start: slice.as_ptr(), 45 | end: if slice.len() == 0 { 46 | // if `len = 0`, then this is not actually allocated memory. 47 | // Need to avoid offsetting because that will give wrong 48 | // information to LLVM via GEP. 49 | slice.as_ptr() 50 | } else { 51 | slice.as_ptr().add(slice.len()) 52 | } 53 | } 54 | } 55 | } 56 | 57 | // Iterator and DoubleEndedIterator impls identical to IntoIter. 58 | ``` 59 | 60 | And IntoIter becomes the following: 61 | 62 | 63 | ```rust,ignore 64 | pub struct IntoIter { 65 | _buf: RawVec, // we don't actually care about this. Just need it to live. 66 | iter: RawValIter, 67 | } 68 | 69 | impl Iterator for IntoIter { 70 | type Item = T; 71 | fn next(&mut self) -> Option { self.iter.next() } 72 | fn size_hint(&self) -> (usize, Option) { self.iter.size_hint() } 73 | } 74 | 75 | impl DoubleEndedIterator for IntoIter { 76 | fn next_back(&mut self) -> Option { self.iter.next_back() } 77 | } 78 | 79 | impl Drop for IntoIter { 80 | fn drop(&mut self) { 81 | for _ in &mut *self {} 82 | } 83 | } 84 | 85 | impl IntoIterator for Vec { 86 | type Item = T; 87 | type IntoIter = IntoIter; 88 | fn into_iter(self) -> IntoIter { 89 | unsafe { 90 | let iter = RawValIter::new(&self); 91 | 92 | let buf = ptr::read(&self.buf); 93 | mem::forget(self); 94 | 95 | IntoIter { 96 | iter, 97 | _buf: buf, 98 | } 99 | } 100 | } 101 | } 102 | ``` 103 | 104 | Note that I've left a few quirks in this design to make upgrading Drain to work 105 | with arbitrary subranges a bit easier. In particular we *could* have RawValIter 106 | drain itself on drop, but that won't work right for a more complex Drain. 107 | We also take a slice to simplify Drain initialization. 
108 | 109 | Alright, now Drain is really easy: 110 | 111 | 112 | ```rust,ignore 113 | use std::marker::PhantomData; 114 | 115 | pub struct Drain<'a, T: 'a> { 116 | vec: PhantomData<&'a mut Vec>, 117 | iter: RawValIter, 118 | } 119 | 120 | impl<'a, T> Iterator for Drain<'a, T> { 121 | type Item = T; 122 | fn next(&mut self) -> Option { self.iter.next() } 123 | fn size_hint(&self) -> (usize, Option) { self.iter.size_hint() } 124 | } 125 | 126 | impl<'a, T> DoubleEndedIterator for Drain<'a, T> { 127 | fn next_back(&mut self) -> Option { self.iter.next_back() } 128 | } 129 | 130 | impl<'a, T> Drop for Drain<'a, T> { 131 | fn drop(&mut self) { 132 | for _ in &mut *self {} 133 | } 134 | } 135 | 136 | impl Vec { 137 | pub fn drain(&mut self) -> Drain { 138 | let iter = unsafe { RawValIter::new(&self) }; 139 | 140 | // this is a mem::forget safety thing. If Drain is forgotten, we just 141 | // leak the whole Vec's contents. Also we need to do this *eventually* 142 | // anyway, so why not do it now? 143 | self.len = 0; 144 | 145 | Drain { 146 | iter, 147 | vec: PhantomData, 148 | } 149 | } 150 | } 151 | ``` 152 | 153 | For more details on the `mem::forget` problem, see the 154 | [section on leaks][leaks]. 155 | 156 | [leaks]: ../leaking.html 157 | -------------------------------------------------------------------------------- /src/vec/vec-into-iter.md: -------------------------------------------------------------------------------- 1 | # IntoIter 2 | 3 | Let's move on to writing iterators. `iter` and `iter_mut` have already been 4 | written for us thanks to The Magic of Deref. However there's two interesting 5 | iterators that Vec provides that slices can't: `into_iter` and `drain`. 6 | 7 | IntoIter consumes the Vec by-value, and can consequently yield its elements 8 | by-value. In order to enable this, IntoIter needs to take control of Vec's 9 | allocation. 10 | 11 | IntoIter needs to be DoubleEnded as well, to enable reading from both ends. 12 | Reading from the back could just be implemented as calling `pop`, but reading 13 | from the front is harder. We could call `remove(0)` but that would be insanely 14 | expensive. Instead we're going to just use ptr::read to copy values out of 15 | either end of the Vec without mutating the buffer at all. 16 | 17 | To do this we're going to use a very common C idiom for array iteration. We'll 18 | make two pointers; one that points to the start of the array, and one that 19 | points to one-element past the end. When we want an element from one end, we'll 20 | read out the value pointed to at that end and move the pointer over by one. When 21 | the two pointers are equal, we know we're done. 22 | 23 | Note that the order of read and offset are reversed for `next` and `next_back` 24 | For `next_back` the pointer is always after the element it wants to read next, 25 | while for `next` the pointer is always at the element it wants to read next. 26 | To see why this is, consider the case where every element but one has been 27 | yielded. 28 | 29 | The array looks like this: 30 | 31 | ```text 32 | S E 33 | [X, X, X, O, X, X, X] 34 | ``` 35 | 36 | If E pointed directly at the element it wanted to yield next, it would be 37 | indistinguishable from the case where there are no more elements to yield. 38 | 39 | Although we don't actually care about it during iteration, we also need to hold 40 | onto the Vec's allocation information in order to free it once IntoIter is 41 | dropped. 
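As a standalone illustration of that start/end pointer idiom, here is a toy snippet over a plain array (separate from our Vec code):

```rust
let data = [10u32, 20, 30];

let mut start = data.as_ptr();
// One past the last element: fine to create, but we never read through it.
let end = unsafe { data.as_ptr().add(data.len()) };

while start != end {
    // Read the current element, then step the pointer forward.
    println!("{}", unsafe { *start });
    start = unsafe { start.add(1) };
}
```

The `end` pointer is only ever compared against, never dereferenced, which is exactly the discipline the real iterator follows.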
42 | 43 | So we're going to use the following struct: 44 | 45 | 46 | ```rust,ignore 47 | pub struct IntoIter { 48 | buf: NonNull, 49 | cap: usize, 50 | start: *const T, 51 | end: *const T, 52 | } 53 | ``` 54 | 55 | And this is what we end up with for initialization: 56 | 57 | 58 | ```rust,ignore 59 | impl IntoIterator for Vec { 60 | type Item = T; 61 | type IntoIter = IntoIter; 62 | fn into_iter(self) -> IntoIter { 63 | // Make sure not to drop Vec since that would free the buffer 64 | let vec = ManuallyDrop::new(self); 65 | 66 | // Can't destructure Vec since it's Drop 67 | let ptr = vec.ptr; 68 | let cap = vec.cap; 69 | let len = vec.len; 70 | 71 | IntoIter { 72 | buf: ptr, 73 | cap, 74 | start: ptr.as_ptr(), 75 | end: if cap == 0 { 76 | // can't offset off this pointer, it's not allocated! 77 | ptr.as_ptr() 78 | } else { 79 | unsafe { ptr.as_ptr().add(len) } 80 | }, 81 | } 82 | } 83 | } 84 | ``` 85 | 86 | Here's iterating forward: 87 | 88 | 89 | ```rust,ignore 90 | impl Iterator for IntoIter { 91 | type Item = T; 92 | fn next(&mut self) -> Option { 93 | if self.start == self.end { 94 | None 95 | } else { 96 | unsafe { 97 | let result = ptr::read(self.start); 98 | self.start = self.start.offset(1); 99 | Some(result) 100 | } 101 | } 102 | } 103 | 104 | fn size_hint(&self) -> (usize, Option) { 105 | let len = (self.end as usize - self.start as usize) 106 | / mem::size_of::(); 107 | (len, Some(len)) 108 | } 109 | } 110 | ``` 111 | 112 | And here's iterating backwards. 113 | 114 | 115 | ```rust,ignore 116 | impl DoubleEndedIterator for IntoIter { 117 | fn next_back(&mut self) -> Option { 118 | if self.start == self.end { 119 | None 120 | } else { 121 | unsafe { 122 | self.end = self.end.offset(-1); 123 | Some(ptr::read(self.end)) 124 | } 125 | } 126 | } 127 | } 128 | ``` 129 | 130 | Because IntoIter takes ownership of its allocation, it needs to implement Drop 131 | to free it. However it also wants to implement Drop to drop any elements it 132 | contains that weren't yielded. 133 | 134 | 135 | ```rust,ignore 136 | impl Drop for IntoIter { 137 | fn drop(&mut self) { 138 | if self.cap != 0 { 139 | // drop any remaining elements 140 | for _ in &mut *self {} 141 | let layout = Layout::array::(self.cap).unwrap(); 142 | unsafe { 143 | alloc::dealloc(self.buf.as_ptr() as *mut u8, layout); 144 | } 145 | } 146 | } 147 | } 148 | ``` 149 | -------------------------------------------------------------------------------- /src/what-unsafe-does.md: -------------------------------------------------------------------------------- 1 | # What Unsafe Rust Can Do 2 | 3 | The only things that are different in Unsafe Rust are that you can: 4 | 5 | * Dereference raw pointers 6 | * Call `unsafe` functions (including C functions, compiler intrinsics, and the raw allocator) 7 | * Implement `unsafe` traits 8 | * Access or modify mutable statics 9 | * Access fields of `union`s 10 | 11 | That's it. The reason these operations are relegated to Unsafe is that misusing 12 | any of these things will cause the ever dreaded Undefined Behavior. Invoking 13 | Undefined Behavior gives the compiler full rights to do arbitrarily bad things 14 | to your program. You definitely *should not* invoke Undefined Behavior. 15 | 16 | Unlike C, Undefined Behavior is pretty limited in scope in Rust. 
All the core 17 | language cares about is preventing the following things: 18 | 19 | * Dereferencing (using the `*` operator on) dangling or unaligned pointers (see below) 20 | * Breaking the [pointer aliasing rules][] 21 | * Calling a function with the wrong call ABI or unwinding from a function with the wrong unwind ABI. 22 | * Causing a [data race][race] 23 | * Executing code compiled with [target features][] that the current thread of execution does 24 | not support 25 | * Producing invalid values (either alone or as a field of a compound type such 26 | as `enum`/`struct`/array/tuple): 27 | * a `bool` that isn't 0 or 1 28 | * an `enum` with an invalid discriminant 29 | * a null `fn` pointer 30 | * a `char` outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF] 31 | * a `!` (all values are invalid for this type) 32 | * an integer (`i*`/`u*`), floating point value (`f*`), or raw pointer read from 33 | [uninitialized memory][], or uninitialized memory in a `str`. 34 | * a reference/`Box` that is dangling, unaligned, or points to an invalid value. 35 | * a wide reference, `Box`, or raw pointer that has invalid metadata: 36 | * `dyn Trait` metadata is invalid if it is not a pointer to a vtable for 37 | `Trait` that matches the actual dynamic trait the pointer or reference points to 38 | * slice metadata is invalid if the length is not a valid `usize` 39 | (i.e., it must not be read from uninitialized memory) 40 | * a type with custom invalid values that is one of those values, such as a 41 | [`NonNull`] that is null. (Requesting custom invalid values is an unstable 42 | feature, but some stable libstd types, like `NonNull`, make use of it.) 43 | 44 | For a more detailed explanation about "Undefined Behavior", you may refer to 45 | [the reference][behavior-considered-undefined]. 46 | 47 | "Producing" a value happens any time a value is assigned, passed to a 48 | function/primitive operation or returned from a function/primitive operation. 49 | 50 | A reference/pointer is "dangling" if it is null or not all of the bytes it 51 | points to are part of the same allocation (so in particular they all have to be 52 | part of *some* allocation). The span of bytes it points to is determined by the 53 | pointer value and the size of the pointee type. As a consequence, if the span is 54 | empty, "dangling" is the same as "null". Note that slices and strings point 55 | to their entire range, so it's important that the length metadata is never too 56 | large (in particular, allocations and therefore slices and strings cannot be 57 | bigger than `isize::MAX` bytes). If for some reason this is too cumbersome, 58 | consider using raw pointers. 59 | 60 | That's it. That's all the causes of Undefined Behavior baked into Rust. Of 61 | course, unsafe functions and traits are free to declare arbitrary other 62 | constraints that a program must maintain to avoid Undefined Behavior. For 63 | instance, the allocator APIs declare that deallocating unallocated memory is 64 | Undefined Behavior. 65 | 66 | However, violations of these constraints generally will just transitively lead to one of 67 | the above problems. Some additional constraints may also derive from compiler 68 | intrinsics that make special assumptions about how code can be optimized. For instance, 69 | Vec and Box make use of intrinsics that require their pointers to be non-null at all times. 70 | 71 | Rust is otherwise quite permissive with respect to other dubious operations. 
72 | Rust considers it "safe" to: 73 | 74 | * Deadlock 75 | * Have a [race condition][race] 76 | * Leak memory 77 | * Overflow integers (with the built-in operators such as `+` etc.) 78 | * Abort the program 79 | * Delete the production database 80 | 81 | For more detailed information, you may refer to [the reference][behavior-not-considered-unsafe]. 82 | 83 | However any program that actually manages to do such a thing is *probably* 84 | incorrect. Rust provides lots of tools to make these things rare, but 85 | these problems are considered impractical to categorically prevent. 86 | 87 | [pointer aliasing rules]: references.html 88 | [uninitialized memory]: uninitialized.html 89 | [race]: races.html 90 | [target features]: ../reference/attributes/codegen.html#the-target_feature-attribute 91 | [`NonNull`]: ../std/ptr/struct.NonNull.html 92 | [behavior-considered-undefined]: ../reference/behavior-considered-undefined.html 93 | [behavior-not-considered-unsafe]: ../reference/behavior-not-considered-unsafe.html 94 | -------------------------------------------------------------------------------- /src/repr-rust.md: -------------------------------------------------------------------------------- 1 | # repr(Rust) 2 | 3 | First and foremost, all types have an alignment specified in bytes. The 4 | alignment of a type specifies what addresses are valid to store the value at. A 5 | value with alignment `n` must only be stored at an address that is a multiple of 6 | `n`. So alignment 2 means you must be stored at an even address, and 1 means 7 | that you can be stored anywhere. Alignment is at least 1, and always a power 8 | of 2. 9 | 10 | Primitives are usually aligned to their size, although this is 11 | platform-specific behavior. For example, on x86 `u64` and `f64` are often 12 | aligned to 4 bytes (32 bits). 13 | 14 | A type's size must always be a multiple of its alignment (Zero being a valid size 15 | for any alignment). This ensures that an array of that type may always be indexed 16 | by offsetting by a multiple of its size. Note that the size and alignment of a 17 | type may not be known statically in the case of [dynamically sized types][dst]. 18 | 19 | Rust gives you the following ways to lay out composite data: 20 | 21 | * structs (named product types) 22 | * tuples (anonymous product types) 23 | * arrays (homogeneous product types) 24 | * enums (named sum types -- tagged unions) 25 | * unions (untagged unions) 26 | 27 | An enum is said to be *field-less* if none of its variants have associated data. 28 | 29 | By default, composite structures have an alignment equal to the maximum 30 | of their fields' alignments. Rust will consequently insert padding where 31 | necessary to ensure that all fields are properly aligned and that the overall 32 | type's size is a multiple of its alignment. For instance: 33 | 34 | ```rust 35 | struct A { 36 | a: u8, 37 | b: u32, 38 | c: u16, 39 | } 40 | ``` 41 | 42 | will be 32-bit aligned on a target that aligns these primitives to their 43 | respective sizes. The whole struct will therefore have a size that is a multiple 44 | of 32-bits. 
It may become: 45 | 46 | ```rust 47 | struct A { 48 | a: u8, 49 | _pad1: [u8; 3], // to align `b` 50 | b: u32, 51 | c: u16, 52 | _pad2: [u8; 2], // to make overall size multiple of 4 53 | } 54 | ``` 55 | 56 | or maybe: 57 | 58 | ```rust 59 | struct A { 60 | b: u32, 61 | c: u16, 62 | a: u8, 63 | _pad: u8, 64 | } 65 | ``` 66 | 67 | There is *no indirection* for these types; all data is stored within the struct, 68 | as you would expect in C. However with the exception of arrays (which are 69 | densely packed and in-order), the layout of data is not specified by default. 70 | Given the two following struct definitions: 71 | 72 | ```rust 73 | struct A { 74 | a: i32, 75 | b: u64, 76 | } 77 | 78 | struct B { 79 | a: i32, 80 | b: u64, 81 | } 82 | ``` 83 | 84 | Rust *does* guarantee that two instances of A have their data laid out in 85 | exactly the same way. However Rust *does not* currently guarantee that an 86 | instance of A has the same field ordering or padding as an instance of B. 87 | 88 | With A and B as written, this point would seem to be pedantic, but several other 89 | features of Rust make it desirable for the language to play with data layout in 90 | complex ways. 91 | 92 | For instance, consider this struct: 93 | 94 | ```rust 95 | struct Foo { 96 | count: u16, 97 | data1: T, 98 | data2: U, 99 | } 100 | ``` 101 | 102 | Now consider the monomorphizations of `Foo` and `Foo`. If 103 | Rust lays out the fields in the order specified, we expect it to pad the 104 | values in the struct to satisfy their alignment requirements. So if Rust 105 | didn't reorder fields, we would expect it to produce the following: 106 | 107 | 108 | ```rust,ignore 109 | struct Foo { 110 | count: u16, 111 | data1: u16, 112 | data2: u32, 113 | } 114 | 115 | struct Foo { 116 | count: u16, 117 | _pad1: u16, 118 | data1: u32, 119 | data2: u16, 120 | _pad2: u16, 121 | } 122 | ``` 123 | 124 | The latter case quite simply wastes space. An optimal use of space 125 | requires different monomorphizations to have *different field orderings*. 126 | 127 | Enums make this consideration even more complicated. Naively, an enum such as: 128 | 129 | ```rust 130 | enum Foo { 131 | A(u32), 132 | B(u64), 133 | C(u8), 134 | } 135 | ``` 136 | 137 | might be laid out as: 138 | 139 | ```rust 140 | struct FooRepr { 141 | data: u64, // this is either a u64, u32, or u8 based on `tag` 142 | tag: u8, // 0 = A, 1 = B, 2 = C 143 | } 144 | ``` 145 | 146 | And indeed this is approximately how it would be laid out (modulo the 147 | size and position of `tag`). 148 | 149 | However there are several cases where such a representation is inefficient. The 150 | classic case of this is Rust's "null pointer optimization": an enum consisting 151 | of a single outer unit variant (e.g. `None`) and a (potentially nested) non- 152 | nullable pointer variant (e.g. `Some(&T)`) makes the tag unnecessary. A null 153 | pointer can safely be interpreted as the unit (`None`) variant. The net 154 | result is that, for example, `size_of::>() == size_of::<&T>()`. 155 | 156 | There are many types in Rust that are, or contain, non-nullable pointers such as 157 | `Box`, `Vec`, `String`, `&T`, and `&mut T`. Similarly, one can imagine 158 | nested enums pooling their tags into a single discriminant, as they are by 159 | definition known to have a limited range of valid values. In principle enums could 160 | use fairly elaborate algorithms to store bits throughout nested types with 161 | forbidden values. 
As such it is *especially* desirable that 162 | we leave enum layout unspecified today. 163 | 164 | [dst]: exotic-sizes.html#dynamically-sized-types-dsts 165 | -------------------------------------------------------------------------------- /src/dot-operator.md: -------------------------------------------------------------------------------- 1 | # The Dot Operator 2 | 3 | The dot operator will perform a lot of magic to convert types. 4 | It will perform auto-referencing, auto-dereferencing, and coercion until types 5 | match. 6 | The detailed mechanics of method lookup are defined [here][method_lookup], 7 | but here is a brief overview that outlines the main steps. 8 | 9 | Suppose we have a function `foo` that has a receiver (a `self`, `&self` or 10 | `&mut self` parameter). 11 | If we call `value.foo()`, the compiler needs to determine what type `Self` is before 12 | it can call the correct implementation of the function. 13 | For this example, we will say that `value` has type `T`. 14 | 15 | We will use [fully-qualified syntax][fqs] to be more clear about exactly which 16 | type we are calling a function on. 17 | 18 | - First, the compiler checks if it can call `T::foo(value)` directly. 19 | This is called a "by value" method call. 20 | - If it can't call this function (for example, if the function has the wrong type 21 | or a trait isn't implemented for `Self`), then the compiler tries to add in an 22 | automatic reference. 23 | This means that the compiler tries `<&T>::foo(value)` and `<&mut T>::foo(value)`. 24 | This is called an "autoref" method call. 25 | - If none of these candidates worked, it dereferences `T` and tries again. 26 | This uses the `Deref` trait - if `T: Deref` then it tries again with 27 | type `U` instead of `T`. 28 | If it can't dereference `T`, it can also try _unsizing_ `T`. 29 | This just means that if `T` has a size parameter known at compile time, it "forgets" 30 | it for the purpose of resolving methods. 31 | For instance, this unsizing step can convert `[i32; 2]` into `[i32]` by "forgetting" 32 | the size of the array. 33 | 34 | Here is an example of the method lookup algorithm: 35 | 36 | ```rust,ignore 37 | let array: Rc> = ...; 38 | let first_entry = array[0]; 39 | ``` 40 | 41 | How does the compiler actually compute `array[0]` when the array is behind so 42 | many indirections? 43 | First, `array[0]` is really just syntax sugar for the [`Index`][index] trait - 44 | the compiler will convert `array[0]` into `array.index(0)`. 45 | Now, the compiler checks to see if `array` implements `Index`, so that it can call 46 | the function. 47 | 48 | Then, the compiler checks if `Rc>` implements `Index`, but it 49 | does not, and neither do `&Rc>` or `&mut Rc>`. 50 | Since none of these worked, the compiler dereferences the `Rc>` into 51 | `Box<[T; 3]>` and tries again. 52 | `Box<[T; 3]>`, `&Box<[T; 3]>`, and `&mut Box<[T; 3]>` do not implement `Index`, 53 | so it dereferences again. 54 | `[T; 3]` and its autorefs also do not implement `Index`. 55 | It can't dereference `[T; 3]`, so the compiler unsizes it, giving `[T]`. 56 | Finally, `[T]` implements `Index`, so it can now call the actual `index` function. 57 | 58 | Consider the following more complicated example of the dot operator at work: 59 | 60 | ```rust 61 | fn do_stuff(value: &T) { 62 | let cloned = value.clone(); 63 | } 64 | ``` 65 | 66 | What type is `cloned`? 67 | First, the compiler checks if it can call by value. 
68 | The type of `value` is `&T`, and so the `clone` function has signature 69 | `fn clone(&T) -> T`. 70 | It knows that `T: Clone`, so the compiler finds that `cloned: T`. 71 | 72 | What would happen if the `T: Clone` restriction was removed? It would not be able 73 | to call by value, since there is no implementation of `Clone` for `T`. 74 | So the compiler tries to call by autoref. 75 | In this case, the function has the signature `fn clone(&&T) -> &T` since 76 | `Self = &T`. 77 | The compiler sees that `&T: Clone`, and then deduces that `cloned: &T`. 78 | 79 | Here is another example where the autoref behavior is used to create some subtle 80 | effects: 81 | 82 | ```rust 83 | # use std::sync::Arc; 84 | # 85 | #[derive(Clone)] 86 | struct Container(Arc); 87 | 88 | fn clone_containers(foo: &Container, bar: &Container) { 89 | let foo_cloned = foo.clone(); 90 | let bar_cloned = bar.clone(); 91 | } 92 | ``` 93 | 94 | What types are `foo_cloned` and `bar_cloned`? 95 | We know that `Container: Clone`, so the compiler calls `clone` by value to give 96 | `foo_cloned: Container`. 97 | However, `bar_cloned` actually has type `&Container`. 98 | Surely this doesn't make sense - we added `#[derive(Clone)]` to `Container`, so it 99 | must implement `Clone`! 100 | Looking closer, the code generated by the `derive` macro is (roughly): 101 | 102 | ```rust,ignore 103 | impl Clone for Container where T: Clone { 104 | fn clone(&self) -> Self { 105 | Self(Arc::clone(&self.0)) 106 | } 107 | } 108 | ``` 109 | 110 | The derived `Clone` implementation is [only defined where `T: Clone`][clone], 111 | so there is no implementation for `Container: Clone` for a generic `T`. 112 | The compiler then looks to see if `&Container` implements `Clone`, which it does. 113 | So it deduces that `clone` is called by autoref, and so `bar_cloned` has type 114 | `&Container`. 115 | 116 | We can fix this by implementing `Clone` manually without requiring `T: Clone`: 117 | 118 | ```rust,ignore 119 | impl Clone for Container { 120 | fn clone(&self) -> Self { 121 | Self(Arc::clone(&self.0)) 122 | } 123 | } 124 | ``` 125 | 126 | Now, the type checker deduces that `bar_cloned: Container`. 127 | 128 | [fqs]: ../book/ch19-03-advanced-traits.html#fully-qualified-syntax-for-disambiguation-calling-methods-with-the-same-name 129 | [method_lookup]: https://rustc-dev-guide.rust-lang.org/hir-typeck/method-lookup.html 130 | [index]: ../std/ops/trait.Index.html 131 | [clone]: ../std/clone/trait.Clone.html#derivable 132 | -------------------------------------------------------------------------------- /src/aliasing.md: -------------------------------------------------------------------------------- 1 | # Aliasing 2 | 3 | First off, let's get some important caveats out of the way: 4 | 5 | * We will be using the broadest possible definition of aliasing for the sake 6 | of discussion. Rust's definition will probably be more restricted to factor 7 | in mutations and liveness. 8 | 9 | * We will be assuming a single-threaded, interrupt-free, execution. We will also 10 | be ignoring things like memory-mapped hardware. Rust assumes these things 11 | don't happen unless you tell it otherwise. For more details, see the 12 | [Concurrency Chapter](concurrency.html). 13 | 14 | With that said, here's our working definition: variables and pointers *alias* 15 | if they refer to overlapping regions of memory. 16 | 17 | ## Why Aliasing Matters 18 | 19 | So why should we care about aliasing? 
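As a minimal, concrete instance of that working definition, two overlapping shared borrows of the same array alias, since both cover the element in the middle:

```rust
let x = [1u8, 2, 3, 4];
let a = &x[0..3];
let b = &x[2..4];
// Both slices include `x[2]`, so `a` and `b` alias under the definition above.
assert_eq!(a[2], b[0]);
```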
20 | 21 | Consider this simple function: 22 | 23 | ```rust 24 | fn compute(input: &u32, output: &mut u32) { 25 | if *input > 10 { 26 | *output = 1; 27 | } 28 | if *input > 5 { 29 | *output *= 2; 30 | } 31 | // remember that `output` will be `2` if `input > 10` 32 | } 33 | ``` 34 | 35 | We would *like* to be able to optimize it to the following function: 36 | 37 | ```rust 38 | fn compute(input: &u32, output: &mut u32) { 39 | let cached_input = *input; // keep `*input` in a register 40 | if cached_input > 10 { 41 | // If the input is greater than 10, the previous code would set the output to 1 and then double it, 42 | // resulting in an output of 2 (because `>10` implies `>5`). 43 | // Here, we avoid the double assignment and just set it directly to 2. 44 | *output = 2; 45 | } else if cached_input > 5 { 46 | *output *= 2; 47 | } 48 | } 49 | ``` 50 | 51 | In Rust, this optimization should be sound. For almost any other language, it 52 | wouldn't be (barring global analysis). This is because the optimization relies 53 | on knowing that aliasing doesn't occur, which most languages are fairly liberal 54 | with. Specifically, we need to worry about function arguments that make `input` 55 | and `output` overlap, such as `compute(&x, &mut x)`. 56 | 57 | With that input, we could get this execution: 58 | 59 | 60 | ```rust,ignore 61 | // input == output == 0xabad1dea 62 | // *input == *output == 20 63 | if *input > 10 { // true (*input == 20) 64 | *output = 1; // also overwrites *input, because they are the same 65 | } 66 | if *input > 5 { // false (*input == 1) 67 | *output *= 2; 68 | } 69 | // *input == *output == 1 70 | ``` 71 | 72 | Our optimized function would produce `*output == 2` for this input, so the 73 | correctness of our optimization relies on this input being impossible. 74 | 75 | In Rust we know this input should be impossible because `&mut` isn't allowed to be 76 | aliased. So we can safely reject its possibility and perform this optimization. 77 | In most other languages, this input would be entirely possible, and must be considered. 78 | 79 | This is why alias analysis is important: it lets the compiler perform useful 80 | optimizations! Some examples: 81 | 82 | * keeping values in registers by proving no pointers access the value's memory 83 | * eliminating reads by proving some memory hasn't been written to since last we read it 84 | * eliminating writes by proving some memory is never read before the next write to it 85 | * moving or reordering reads and writes by proving they don't depend on each other 86 | 87 | These optimizations also tend to prove the soundness of bigger optimizations 88 | such as loop vectorization, constant propagation, and dead code elimination. 89 | 90 | In the previous example, we used the fact that `&mut u32` can't be aliased to prove 91 | that writes to `*output` can't possibly affect `*input`. This lets us cache `*input` 92 | in a register, eliminating a read. 93 | 94 | By caching this read, we knew that the write in the `> 10` branch couldn't 95 | affect whether we take the `> 5` branch, allowing us to also eliminate a 96 | read-modify-write (doubling `*output`) when `*input > 10`. 97 | 98 | The key thing to remember about alias analysis is that writes are the primary 99 | hazard for optimizations. That is, the only thing that prevents us 100 | from moving a read to any other part of the program is the possibility of us 101 | re-ordering it with a write to the same location. 
102 | 103 | For instance, we have no concern for aliasing in the following modified version 104 | of our function, because we've moved the only write to `*output` to the very 105 | end of our function. This allows us to freely reorder the reads of `*input` that 106 | occur before it: 107 | 108 | ```rust 109 | fn compute(input: &u32, output: &mut u32) { 110 | let mut temp = *output; 111 | if *input > 10 { 112 | temp = 1; 113 | } 114 | if *input > 5 { 115 | temp *= 2; 116 | } 117 | *output = temp; 118 | } 119 | ``` 120 | 121 | We're still relying on alias analysis to assume that `input` doesn't alias 122 | `temp`, but the proof is much simpler: the value of a local variable can't be 123 | aliased by things that existed before it was declared. This is an assumption 124 | every language freely makes, and so this version of the function could be 125 | optimized the way we want in any language. 126 | 127 | This is why the definition of "alias" that Rust will use likely involves some 128 | notion of liveness and mutation: we don't actually care if aliasing occurs if 129 | there aren't any actual writes to memory happening. 130 | 131 | Of course, a full aliasing model for Rust must also take into consideration things like 132 | function calls (which may mutate things we don't see), raw pointers (which have 133 | no aliasing requirements on their own), and UnsafeCell (which lets the referent 134 | of an `&` be mutated). 135 | -------------------------------------------------------------------------------- /src/destructors.md: -------------------------------------------------------------------------------- 1 | # Destructors 2 | 3 | What the language *does* provide is full-blown automatic destructors through the 4 | `Drop` trait, which provides the following method: 5 | 6 | 7 | ```rust,ignore 8 | fn drop(&mut self); 9 | ``` 10 | 11 | This method gives the type time to somehow finish what it was doing. 12 | 13 | **After `drop` is run, Rust will recursively try to drop all of the fields 14 | of `self`.** 15 | 16 | This is a convenience feature so that you don't have to write "destructor 17 | boilerplate" to drop children. If a struct has no special logic for being 18 | dropped other than dropping its children, then it means `Drop` doesn't need to 19 | be implemented at all! 20 | 21 | **There is no stable way to prevent this behavior in Rust 1.0.** 22 | 23 | Note that taking `&mut self` means that even if you could suppress recursive 24 | Drop, Rust will prevent you from e.g. moving fields out of self. For most types, 25 | this is totally fine. 26 | 27 | For instance, a custom implementation of `Box` might write `Drop` like this: 28 | 29 | ```rust 30 | #![feature(ptr_internals, allocator_api)] 31 | 32 | use std::alloc::{Allocator, Global, GlobalAlloc, Layout}; 33 | use std::mem; 34 | use std::ptr::{drop_in_place, NonNull, Unique}; 35 | 36 | struct Box{ ptr: Unique } 37 | 38 | impl Drop for Box { 39 | fn drop(&mut self) { 40 | unsafe { 41 | drop_in_place(self.ptr.as_ptr()); 42 | let c: NonNull = self.ptr.into(); 43 | Global.deallocate(c.cast(), Layout::new::()) 44 | } 45 | } 46 | } 47 | # fn main() {} 48 | ``` 49 | 50 | and this works fine because when Rust goes to drop the `ptr` field it just sees 51 | a [Unique] that has no actual `Drop` implementation. Similarly nothing can 52 | use-after-free the `ptr` because when drop exits, it becomes inaccessible. 
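As a quick aside, the recursive-drop order described above is easy to observe with a pair of throwaway types (a hypothetical example, not part of the Box code): the outer value's `drop` runs first, and then its fields are dropped.

```rust
struct Inner;
struct Outer { _inner: Inner }

impl Drop for Inner {
    fn drop(&mut self) { println!("dropping Inner"); }
}

impl Drop for Outer {
    fn drop(&mut self) { println!("dropping Outer"); }
}

fn main() {
    let _outer = Outer { _inner: Inner };
    // Prints "dropping Outer", then "dropping Inner".
}
```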
53 | 54 | However this wouldn't work: 55 | 56 | ```rust 57 | #![feature(allocator_api, ptr_internals)] 58 | 59 | use std::alloc::{Allocator, Global, GlobalAlloc, Layout}; 60 | use std::ptr::{drop_in_place, Unique, NonNull}; 61 | use std::mem; 62 | 63 | struct Box{ ptr: Unique } 64 | 65 | impl Drop for Box { 66 | fn drop(&mut self) { 67 | unsafe { 68 | drop_in_place(self.ptr.as_ptr()); 69 | let c: NonNull = self.ptr.into(); 70 | Global.deallocate(c.cast(), Layout::new::()); 71 | } 72 | } 73 | } 74 | 75 | struct SuperBox { my_box: Box } 76 | 77 | impl Drop for SuperBox { 78 | fn drop(&mut self) { 79 | unsafe { 80 | // Hyper-optimized: deallocate the box's contents for it 81 | // without `drop`ing the contents 82 | let c: NonNull = self.my_box.ptr.into(); 83 | Global.deallocate(c.cast::(), Layout::new::()); 84 | } 85 | } 86 | } 87 | # fn main() {} 88 | ``` 89 | 90 | After we deallocate the `box`'s ptr in SuperBox's destructor, Rust will 91 | happily proceed to tell the box to Drop itself and everything will blow up with 92 | use-after-frees and double-frees. 93 | 94 | Note that the recursive drop behavior applies to all structs and enums 95 | regardless of whether they implement Drop. Therefore something like 96 | 97 | ```rust 98 | struct Boxy { 99 | data1: Box, 100 | data2: Box, 101 | info: u32, 102 | } 103 | ``` 104 | 105 | will have the destructors of its `data1` and `data2` fields called whenever it "would" be 106 | dropped, even though it itself doesn't implement Drop. We say that such a type 107 | *needs Drop*, even though it is not itself Drop. 108 | 109 | Similarly, 110 | 111 | ```rust 112 | enum Link { 113 | Next(Box), 114 | None, 115 | } 116 | ``` 117 | 118 | will have its inner Box field dropped if and only if an instance stores the 119 | Next variant. 120 | 121 | In general this works really nicely because you don't need to worry about 122 | adding/removing drops when you refactor your data layout. Still there's 123 | certainly many valid use cases for needing to do trickier things with 124 | destructors. 125 | 126 | The classic safe solution to overriding recursive drop and allowing moving out 127 | of Self during `drop` is to use an Option: 128 | 129 | ```rust 130 | #![feature(allocator_api, ptr_internals)] 131 | 132 | use std::alloc::{Allocator, GlobalAlloc, Global, Layout}; 133 | use std::ptr::{drop_in_place, Unique, NonNull}; 134 | use std::mem; 135 | 136 | struct Box{ ptr: Unique } 137 | 138 | impl Drop for Box { 139 | fn drop(&mut self) { 140 | unsafe { 141 | drop_in_place(self.ptr.as_ptr()); 142 | let c: NonNull = self.ptr.into(); 143 | Global.deallocate(c.cast(), Layout::new::()); 144 | } 145 | } 146 | } 147 | 148 | struct SuperBox { my_box: Option> } 149 | 150 | impl Drop for SuperBox { 151 | fn drop(&mut self) { 152 | unsafe { 153 | // Hyper-optimized: deallocate the box's contents for it 154 | // without `drop`ing the contents. Need to set the `box` 155 | // field as `None` to prevent Rust from trying to Drop it. 156 | let my_box = self.my_box.take().unwrap(); 157 | let c: NonNull = my_box.ptr.into(); 158 | Global.deallocate(c.cast(), Layout::new::()); 159 | mem::forget(my_box); 160 | } 161 | } 162 | } 163 | # fn main() {} 164 | ``` 165 | 166 | However this has fairly odd semantics: you are saying that a field that *should* 167 | always be Some *may* be None, just because of what happens in the destructor. 
Of 168 | course this conversely makes a lot of sense: you can call arbitrary methods on 169 | self during the destructor, and this should prevent you from ever doing so after 170 | deinitializing the field. Not that it will prevent you from producing any other 171 | arbitrarily invalid state in there. 172 | 173 | On balance this is an ok choice. Certainly what you should reach for by default. 174 | However, in the future we expect there to be a first-class way to announce that 175 | a field shouldn't be automatically dropped. 176 | 177 | [Unique]: phantom-data.html 178 | -------------------------------------------------------------------------------- /src/unchecked-uninit.md: -------------------------------------------------------------------------------- 1 | # Unchecked Uninitialized Memory 2 | 3 | One interesting exception to this rule is working with arrays. Safe Rust doesn't 4 | permit you to partially initialize an array. When you initialize an array, you 5 | can either set every value to the same thing with `let x = [val; N]`, or you can 6 | specify each member individually with `let x = [val1, val2, val3]`. 7 | Unfortunately this is pretty rigid, especially if you need to initialize your 8 | array in a more incremental or dynamic way. 9 | 10 | Unsafe Rust gives us a powerful tool to handle this problem: 11 | [`MaybeUninit`]. This type can be used to handle memory that has not been fully 12 | initialized yet. 13 | 14 | With `MaybeUninit`, we can initialize an array element by element as follows: 15 | 16 | ```rust 17 | use std::mem::{self, MaybeUninit}; 18 | 19 | // Size of the array is hard-coded but easy to change (meaning, changing just 20 | // the constant is sufficient). This means we can't use [a, b, c] syntax to 21 | // initialize the array, though, as we would have to keep that in sync 22 | // with `SIZE`! 23 | const SIZE: usize = 10; 24 | 25 | let x = { 26 | // Create an uninitialized array of `MaybeUninit`. 27 | let mut x = [const { MaybeUninit::uninit() }; SIZE]; 28 | 29 | // Dropping a `MaybeUninit` does nothing. Thus using raw pointer 30 | // assignment instead of `ptr::write` does not cause the old 31 | // uninitialized value to be dropped. 32 | // Exception safety is not a concern because Box can't panic 33 | for i in 0..SIZE { 34 | x[i] = MaybeUninit::new(Box::new(i as u32)); 35 | } 36 | 37 | // Everything is initialized. Transmute the array to the 38 | // initialized type. 39 | unsafe { mem::transmute::<_, [Box; SIZE]>(x) } 40 | }; 41 | 42 | println!("{x:?}"); 43 | ``` 44 | 45 | This code proceeds in three steps: 46 | 47 | 1. Create an array of `MaybeUninit`. 48 | 49 | 2. Initialize the array. The subtle aspect of this is that usually, when we use 50 | `=` to assign to a value that the Rust type checker considers to already be 51 | initialized (like `x[i]`), the old value stored on the left-hand side gets 52 | dropped. This would be a disaster. However, in this case, the type of the 53 | left-hand side is `MaybeUninit>`, and dropping that does not do 54 | anything! See below for some more discussion of this `drop` issue. 55 | 56 | 3. Finally, we have to change the type of our array to remove the 57 | `MaybeUninit`. With current stable Rust, this requires a `transmute`. 58 | This transmute is legal because in memory, `MaybeUninit` looks the same as `T`. 59 | 60 | However, note that in general, `Container>>` does *not* look 61 | the same as `Container`! 
Imagine if `Container` was `Option`, and `T` was 62 | `bool`, then `Option` exploits that `bool` only has two valid values, 63 | but `Option>` cannot do that because the `bool` does not 64 | have to be initialized. 65 | 66 | So, it depends on `Container` whether transmuting away the `MaybeUninit` is 67 | allowed. For arrays, it is (and eventually the standard library will 68 | acknowledge that by providing appropriate methods). 69 | 70 | It's worth spending a bit more time on the loop in the middle, and in particular 71 | the assignment operator and its interaction with `drop`. If we wrote something like: 72 | 73 | 74 | ```rust,ignore 75 | *x[i].as_mut_ptr() = Box::new(i as u32); // WRONG! 76 | ``` 77 | 78 | we would actually overwrite a `Box`, leading to `drop` of uninitialized 79 | data, which would cause much sadness and pain. 80 | 81 | The correct alternative, if for some reason we cannot use `MaybeUninit::new`, is 82 | to use the [`ptr`] module. In particular, it provides three functions that allow 83 | us to assign bytes to a location in memory without dropping the old value: 84 | [`write`], [`copy`], and [`copy_nonoverlapping`]. 85 | 86 | * `ptr::write(ptr, val)` takes a `val` and moves it into the address pointed 87 | to by `ptr`. 88 | * `ptr::copy(src, dest, count)` copies the bits that `count` T items would occupy 89 | from src to dest. (this is equivalent to C's memmove -- note that the argument 90 | order is reversed!) 91 | * `ptr::copy_nonoverlapping(src, dest, count)` does what `copy` does, but a 92 | little faster on the assumption that the two ranges of memory don't overlap. 93 | (this is equivalent to C's memcpy -- note that the argument order is reversed!) 94 | 95 | It should go without saying that these functions, if misused, will cause serious 96 | havoc or just straight up Undefined Behavior. The only requirement of these 97 | functions *themselves* is that the locations you want to read and write 98 | are allocated and properly aligned. However, the ways writing arbitrary bits to 99 | arbitrary locations of memory can break things are basically uncountable! 100 | 101 | It's worth noting that you don't need to worry about `ptr::write`-style 102 | shenanigans with types which don't implement `Drop` or contain `Drop` types, 103 | because Rust knows not to try to drop them. This is what we relied on in the 104 | above example. 105 | 106 | However when working with uninitialized memory you need to be ever-vigilant for 107 | Rust trying to drop values you make like this before they're fully initialized. 108 | Every control path through that variable's scope must initialize the value 109 | before it ends, if it has a destructor. 110 | *[This includes code panicking](unwinding.html)*. `MaybeUninit` helps a bit 111 | here, because it does not implicitly drop its content - but all this really 112 | means in case of a panic is that instead of a double-free of the not yet 113 | initialized parts, you end up with a memory leak of the already initialized 114 | parts. 115 | 116 | Note that, to use the `ptr` methods, you need to first obtain a *raw pointer* to 117 | the data you want to initialize. It is illegal to construct a *reference* to 118 | uninitialized data, which implies that you have to be careful when obtaining 119 | said raw pointer: 120 | 121 | * For an array of `T`, you can use `base_ptr.add(idx)` where `base_ptr: *mut T` 122 | to compute the address of array index `idx`. This relies on 123 | how arrays are laid out in memory. 
124 | * For a struct, however, in general we do not know how it is laid out, and we 125 | also cannot use `&mut base_ptr.field` as that would be creating a 126 | reference. So, you must carefully use the [raw reference][raw_reference] syntax. This creates 127 | a raw pointer to the field without creating an intermediate reference: 128 | 129 | ```rust 130 | use std::{ptr, mem::MaybeUninit}; 131 | 132 | struct Demo { 133 | field: bool, 134 | } 135 | 136 | let mut uninit = MaybeUninit::::uninit(); 137 | // `&uninit.as_mut().field` would create a reference to an uninitialized `bool`, 138 | // and thus be Undefined Behavior! 139 | let f1_ptr = unsafe { &raw mut (*uninit.as_mut_ptr()).field }; 140 | unsafe { f1_ptr.write(true); } 141 | 142 | let init = unsafe { uninit.assume_init() }; 143 | ``` 144 | 145 | One last remark: when reading old Rust code, you might stumble upon the 146 | deprecated `mem::uninitialized` function. That function used to be the only way 147 | to deal with uninitialized memory on the stack, but it turned out to be 148 | impossible to properly integrate with the rest of the language. Always use 149 | `MaybeUninit` instead in new code, and port old code over when you get the 150 | opportunity. 151 | 152 | And that's about it for working with uninitialized memory! Basically nothing 153 | anywhere expects to be handed uninitialized memory, so if you're going to pass 154 | it around at all, be sure to be *really* careful. 155 | 156 | [`MaybeUninit`]: ../core/mem/union.MaybeUninit.html 157 | [`ptr`]: ../core/ptr/index.html 158 | [raw_reference]: ../reference/types/pointer.html#r-type.pointer.raw.constructor 159 | [`write`]: ../core/ptr/fn.write.html 160 | [`copy`]: ../std/ptr/fn.copy.html 161 | [`copy_nonoverlapping`]: ../std/ptr/fn.copy_nonoverlapping.html 162 | -------------------------------------------------------------------------------- /src/exotic-sizes.md: -------------------------------------------------------------------------------- 1 | # Exotically Sized Types 2 | 3 | Most of the time, we expect types to have a statically known and positive size. 4 | This isn't always the case in Rust. 5 | 6 | ## Dynamically Sized Types (DSTs) 7 | 8 | Rust supports Dynamically Sized Types (DSTs): types without a statically 9 | known size or alignment. On the surface, this is a bit nonsensical: Rust *must* 10 | know the size and alignment of something in order to correctly work with it! In 11 | this regard, DSTs are not normal types. Since they lack a statically known 12 | size, these types can only exist behind a pointer. Any pointer to a 13 | DST consequently becomes a *wide* pointer consisting of the pointer and the 14 | information that "completes" them (more on this below). 15 | 16 | There are two major DSTs exposed by the language: 17 | 18 | * trait objects: `dyn MyTrait` 19 | * slices: [`[T]`][slice], [`str`], and others 20 | 21 | A trait object represents some type that implements the traits it specifies. 22 | The exact original type is *erased* in favor of runtime reflection 23 | with a vtable containing all the information necessary to use the type. 24 | The information that completes a trait object pointer is the vtable pointer. 25 | The runtime size of the pointee can be dynamically requested from the vtable. 26 | 27 | A slice is simply a view into some contiguous storage -- typically an array or 28 | `Vec`. The information that completes a slice pointer is just the number of elements 29 | it points to. 
The runtime size of the pointee is just the statically known size 30 | of an element multiplied by the number of elements. 31 | 32 | Structs can actually store a single DST directly as their last field, but this 33 | makes them a DST as well: 34 | 35 | ```rust 36 | // Can't be stored on the stack directly 37 | struct MySuperSlice { 38 | info: u32, 39 | data: [u8], 40 | } 41 | ``` 42 | 43 | Unfortunately, such a type is largely useless without a way to construct it. Currently the 44 | only properly supported way to create a custom DST is by making your type generic 45 | and performing an *unsizing coercion*: 46 | 47 | ```rust 48 | struct MySuperSliceable { 49 | info: u32, 50 | data: T, 51 | } 52 | 53 | fn main() { 54 | let sized: MySuperSliceable<[u8; 8]> = MySuperSliceable { 55 | info: 17, 56 | data: [0; 8], 57 | }; 58 | 59 | let dynamic: &MySuperSliceable<[u8]> = &sized; 60 | 61 | // prints: "17 [0, 0, 0, 0, 0, 0, 0, 0]" 62 | println!("{} {:?}", dynamic.info, &dynamic.data); 63 | } 64 | ``` 65 | 66 | (Yes, custom DSTs are a largely half-baked feature for now.) 67 | 68 | ## Zero Sized Types (ZSTs) 69 | 70 | Rust also allows types to be specified that occupy no space: 71 | 72 | ```rust 73 | struct Nothing; // No fields = no size 74 | 75 | // All fields have no size = no size 76 | struct LotsOfNothing { 77 | foo: Nothing, 78 | qux: (), // empty tuple has no size 79 | baz: [u8; 0], // empty array has no size 80 | } 81 | ``` 82 | 83 | On their own, Zero Sized Types (ZSTs) are, for obvious reasons, pretty useless. 84 | However as with many curious layout choices in Rust, their potential is realized 85 | in a generic context: Rust largely understands that any operation that produces 86 | or stores a ZST can be reduced to a no-op. First off, storing it doesn't even 87 | make sense -- it doesn't occupy any space. Also there's only one value of that 88 | type, so anything that loads it can just produce it from the aether -- which is 89 | also a no-op since it doesn't occupy any space. 90 | 91 | One of the most extreme examples of this is Sets and Maps. Given a 92 | `Map`, it is common to implement a `Set` as just a thin wrapper 93 | around `Map`. In many languages, this would necessitate 94 | allocating space for UselessJunk and doing work to store and load UselessJunk 95 | only to discard it. Proving this unnecessary would be a difficult analysis for 96 | the compiler. 97 | 98 | However in Rust, we can just say that `Set = Map`. Now Rust 99 | statically knows that every load and store is useless, and no allocation has any 100 | size. The result is that the monomorphized code is basically a custom 101 | implementation of a HashSet with none of the overhead that HashMap would have to 102 | support values. 103 | 104 | Safe code need not worry about ZSTs, but *unsafe* code must be careful about the 105 | consequence of types with no size. In particular, pointer offsets are no-ops, 106 | and allocators typically [require a non-zero size][alloc]. 107 | 108 | Note that references to ZSTs (including empty slices), just like all other 109 | references, must be non-null and suitably aligned. However, loading or storing 110 | through a null pointer to a ZST is not [undefined behavior][ub], unlike 111 | pointers to other types. 112 | 113 | [alloc]: ../std/alloc/trait.GlobalAlloc.html#tymethod.alloc 114 | [ub]: what-unsafe-does.html 115 | 116 | ## Empty Types 117 | 118 | Rust also enables types to be declared that *cannot even be instantiated*. 
These 119 | types can only be talked about at the type level, and never at the value level. 120 | Empty types can be declared by specifying an enum with no variants: 121 | 122 | ```rust 123 | enum Void {} // No variants = EMPTY 124 | ``` 125 | 126 | Empty types are even more marginal than ZSTs. The primary motivating example for 127 | an empty type is type-level unreachability. For instance, suppose an API needs to 128 | return a Result in general, but a specific case actually is infallible. It's 129 | actually possible to communicate this at the type level by returning a 130 | `Result`. Consumers of the API can confidently unwrap such a Result 131 | knowing that it's *statically impossible* for this value to be an `Err`, as 132 | this would require providing a value of type `Void`. 133 | 134 | In principle, Rust can do some interesting analyses and optimizations based 135 | on this fact. For instance, `Result` is represented as just `T`, 136 | because the `Err` case doesn't actually exist (strictly speaking, this is only 137 | an optimization that is not guaranteed, so for example transmuting one into the 138 | other is still Undefined Behavior). 139 | 140 | The following also compiles: 141 | 142 | ```rust 143 | enum Void {} 144 | 145 | let res: Result = Ok(0); 146 | 147 | // Err doesn't exist anymore, so Ok is actually irrefutable. 148 | let Ok(num) = res; 149 | ``` 150 | 151 | One final subtle detail about empty types is that raw pointers to them are 152 | actually valid to construct, but dereferencing them is Undefined Behavior 153 | because that wouldn't make sense. 154 | 155 | We recommend against modelling C's `void*` type with `*const Void`. 156 | A lot of people started doing that but quickly ran into trouble because 157 | Rust doesn't really have any safety guards against trying to instantiate 158 | empty types with unsafe code, and if you do it, it's Undefined Behavior. 159 | This was especially problematic because developers had a habit of converting 160 | raw pointers to references and `&Void` is *also* Undefined Behavior to 161 | construct. 162 | 163 | `*const ()` (or equivalent) works reasonably well for `void*`, and can be made 164 | into a reference without any safety problems. It still doesn't prevent you from 165 | trying to read or write values, but at least it compiles to a no-op instead 166 | of Undefined Behavior. 167 | 168 | ## Extern Types 169 | 170 | There is [an accepted RFC][extern-types] to add proper types with an unknown size, 171 | called *extern types*, which would let Rust developers model things like C's `void*` 172 | and other "declared but never defined" types more accurately. However as of 173 | Rust 2018, [the feature is stuck in limbo over how `size_of_val::()` 174 | should behave][extern-types-issue]. 175 | 176 | [extern-types]: https://github.com/rust-lang/rfcs/blob/master/text/1861-extern-types.md 177 | [extern-types-issue]: https://github.com/rust-lang/rust/issues/43467 178 | [`str`]: ../std/primitive.str.html 179 | [slice]: ../std/primitive.slice.html 180 | -------------------------------------------------------------------------------- /src/exception-safety.md: -------------------------------------------------------------------------------- 1 | # Exception Safety 2 | 3 | Although programs should use unwinding sparingly, there's a lot of code that 4 | *can* panic. If you unwrap a None, index out of bounds, or divide by 0, your 5 | program will panic. On debug builds, every arithmetic operation can panic 6 | if it overflows. 
Unless you are very careful and tightly control what code runs, 7 | pretty much everything can unwind, and you need to be ready for it. 8 | 9 | Being ready for unwinding is often referred to as *exception safety* 10 | in the broader programming world. In Rust, there are two levels of exception 11 | safety that one may concern themselves with: 12 | 13 | * In unsafe code, we *must* be exception safe to the point of not violating 14 | memory safety. We'll call this *minimal* exception safety. 15 | 16 | * In safe code, it is *good* to be exception safe to the point of your program 17 | doing the right thing. We'll call this *maximal* exception safety. 18 | 19 | As is the case in many places in Rust, Unsafe code must be ready to deal with 20 | bad Safe code when it comes to unwinding. Code that transiently creates 21 | unsound states must be careful that a panic does not cause that state to be 22 | used. Generally this means ensuring that only non-panicking code is run while 23 | these states exist, or making a guard that cleans up the state in the case of 24 | a panic. This does not necessarily mean that the state a panic witnesses is a 25 | fully coherent state. We need only guarantee that it's a *safe* state. 26 | 27 | Most Unsafe code is leaf-like, and therefore fairly easy to make exception-safe. 28 | It controls all the code that runs, and most of that code can't panic. However 29 | it is not uncommon for Unsafe code to work with arrays of temporarily 30 | uninitialized data while repeatedly invoking caller-provided code. Such code 31 | needs to be careful and consider exception safety. 32 | 33 | ## Vec::push_all 34 | 35 | `Vec::push_all` is a temporary hack to get extending a Vec by a slice reliably 36 | efficient without specialization. Here's a simple implementation: 37 | 38 | 39 | ```rust,ignore 40 | impl Vec { 41 | fn push_all(&mut self, to_push: &[T]) { 42 | self.reserve(to_push.len()); 43 | unsafe { 44 | // can't overflow because we just reserved this 45 | self.set_len(self.len() + to_push.len()); 46 | 47 | for (i, x) in to_push.iter().enumerate() { 48 | self.ptr().add(i).write(x.clone()); 49 | } 50 | } 51 | } 52 | } 53 | ``` 54 | 55 | We bypass `push` in order to avoid redundant capacity and `len` checks on the 56 | Vec that we definitely know has capacity. The logic is totally correct, except 57 | there's a subtle problem with our code: it's not exception-safe! `set_len`, 58 | `add`, and `write` are all fine; `clone` is the panic bomb we over-looked. 59 | 60 | Clone is completely out of our control, and is totally free to panic. If it 61 | does, our function will exit early with the length of the Vec set too large. If 62 | the Vec is looked at or dropped, uninitialized memory will be read! 63 | 64 | The fix in this case is fairly simple. If we want to guarantee that the values 65 | we *did* clone are dropped, we can set the `len` every loop iteration. If we 66 | just want to guarantee that uninitialized memory can't be observed, we can set 67 | the `len` after the loop. 68 | 69 | ## BinaryHeap::sift_up 70 | 71 | Bubbling an element up a heap is a bit more complicated than extending a Vec. 
72 | The pseudocode is as follows: 73 | 74 | ```text 75 | bubble_up(heap, index): 76 | while index != 0 && heap[index] < heap[parent(index)]: 77 | heap.swap(index, parent(index)) 78 | index = parent(index) 79 | ``` 80 | 81 | A literal transcription of this code to Rust is totally fine, but has an annoying 82 | performance characteristic: the `self` element is swapped over and over again 83 | uselessly. We would rather have the following: 84 | 85 | ```text 86 | bubble_up(heap, index): 87 | let elem = heap[index] 88 | while index != 0 && elem < heap[parent(index)]: 89 | heap[index] = heap[parent(index)] 90 | index = parent(index) 91 | heap[index] = elem 92 | ``` 93 | 94 | This code ensures that each element is copied as little as possible (it is in 95 | fact necessary that elem be copied twice in general). However it now exposes 96 | some exception safety trouble! At all times, there exists two copies of one 97 | value. If we panic in this function something will be double-dropped. 98 | Unfortunately, we also don't have full control of the code: that comparison is 99 | user-defined! 100 | 101 | Unlike Vec, the fix isn't as easy here. One option is to break the user-defined 102 | code and the unsafe code into two separate phases: 103 | 104 | ```text 105 | bubble_up(heap, index): 106 | let end_index = index; 107 | while end_index != 0 && heap[index] < heap[parent(end_index)]: 108 | end_index = parent(end_index) 109 | 110 | let elem = heap[index] 111 | while index != end_index: 112 | heap[index] = heap[parent(index)] 113 | index = parent(index) 114 | heap[index] = elem 115 | ``` 116 | 117 | If the user-defined code blows up, that's no problem anymore, because we haven't 118 | actually touched the state of the heap yet. Once we do start messing with the 119 | heap, we're working with only data and functions that we trust, so there's no 120 | concern of panics. 121 | 122 | Perhaps you're not happy with this design. Surely it's cheating! And we have 123 | to do the complex heap traversal *twice*! Alright, let's bite the bullet. Let's 124 | intermix untrusted and unsafe code *for reals*. 125 | 126 | If Rust had `try` and `finally` like in Java, we could do the following: 127 | 128 | ```text 129 | bubble_up(heap, index): 130 | let elem = heap[index] 131 | try: 132 |        while index != 0 && elem < heap[parent(index)]: 133 | heap[index] = heap[parent(index)] 134 | index = parent(index) 135 | finally: 136 | heap[index] = elem 137 | ``` 138 | 139 | The basic idea is simple: if the comparison panics, we just toss the loose 140 | element in the logically uninitialized index and bail out. Anyone who observes 141 | the heap will see a potentially *inconsistent* heap, but at least it won't 142 | cause any double-drops! If the algorithm terminates normally, then this 143 | operation happens to coincide precisely with how we finish up regardless. 144 | 145 | Sadly, Rust has no such construct, so we're going to need to roll our own! The 146 | way to do this is to store the algorithm's state in a separate struct with a 147 | destructor for the "finally" logic. Whether we panic or not, that destructor 148 | will run and clean up after us. 149 | 150 | 151 | ```rust,ignore 152 | struct Hole<'a, T: 'a> { 153 | data: &'a mut [T], 154 | /// `elt` is always `Some` from new until drop. 
155 | elt: Option, 156 | pos: usize, 157 | } 158 | 159 | impl<'a, T> Hole<'a, T> { 160 | fn new(data: &'a mut [T], pos: usize) -> Self { 161 | unsafe { 162 | let elt = ptr::read(&data[pos]); 163 | Hole { 164 | data, 165 | elt: Some(elt), 166 | pos, 167 | } 168 | } 169 | } 170 | 171 | fn pos(&self) -> usize { self.pos } 172 | 173 | fn removed(&self) -> &T { self.elt.as_ref().unwrap() } 174 | 175 | fn get(&self, index: usize) -> &T { &self.data[index] } 176 | 177 | unsafe fn move_to(&mut self, index: usize) { 178 | let index_ptr: *const _ = &self.data[index]; 179 | let hole_ptr = &mut self.data[self.pos]; 180 | ptr::copy_nonoverlapping(index_ptr, hole_ptr, 1); 181 | self.pos = index; 182 | } 183 | } 184 | 185 | impl<'a, T> Drop for Hole<'a, T> { 186 | fn drop(&mut self) { 187 | // fill the hole again 188 | unsafe { 189 | let pos = self.pos; 190 | ptr::write(&mut self.data[pos], self.elt.take().unwrap()); 191 | } 192 | } 193 | } 194 | 195 | impl BinaryHeap { 196 | fn sift_up(&mut self, pos: usize) { 197 | unsafe { 198 | // Take out the value at `pos` and create a hole. 199 | let mut hole = Hole::new(&mut self.data, pos); 200 | 201 | while hole.pos() != 0 { 202 | let parent = parent(hole.pos()); 203 | if hole.removed() <= hole.get(parent) { break } 204 | hole.move_to(parent); 205 | } 206 | // Hole will be unconditionally filled here; panic or not! 207 | } 208 | } 209 | } 210 | ``` 211 | -------------------------------------------------------------------------------- /src/other-reprs.md: -------------------------------------------------------------------------------- 1 | # Alternative representations 2 | 3 | Rust allows you to specify alternative data layout strategies from the default. 4 | 5 | ## repr(C) 6 | 7 | This is the most important `repr`. It has fairly simple intent: do what C does. 8 | The order, size, and alignment of fields is exactly what you would expect from C 9 | or C++. The type is also passed across `extern "C"` function call boundaries the 10 | same way C would pass the corresponding type. Any type you expect to pass through an FFI boundary should have 11 | `repr(C)`, as C is the lingua-franca of the programming world. This is also 12 | necessary to soundly do more elaborate tricks with data layout such as 13 | reinterpreting values as a different type. 14 | 15 | We strongly recommend using [rust-bindgen] and/or [cbindgen] to manage your FFI 16 | boundaries for you. The Rust team works closely with those projects to ensure 17 | that they work robustly and are compatible with current and future guarantees 18 | about type layouts and `repr`s. 19 | 20 | The interaction of `repr(C)` with Rust's more exotic data layout features must be 21 | kept in mind. Due to its dual purpose as "for FFI" and "for layout control", 22 | `repr(C)` can be applied to types that will be nonsensical or problematic if 23 | passed through the FFI boundary. 24 | 25 | * ZSTs are still zero-sized, even though this is not a standard behavior in 26 | C, and is explicitly contrary to the behavior of an empty type in C++, which 27 | says they should still consume a byte of space. 28 | 29 | * DST pointers (wide pointers) and tuples are not a concept 30 | in C, and as such are never FFI-safe. 31 | 32 | * Enums with fields also aren't a concept in C or C++, but a valid bridging 33 | of the types [is defined][really-tagged]. 
34 | 35 | * If `T` is an [FFI-safe non-nullable pointer 36 | type](ffi.html#the-nullable-pointer-optimization), 37 | `Option` is guaranteed to have the same layout and ABI as `T` and is 38 | therefore also FFI-safe. As of this writing, this covers `&`, `&mut`, 39 | and function pointers, all of which can never be null. 40 | 41 | * Tuple structs are like structs with regards to `repr(C)`, as the only 42 | difference from a struct is that the fields aren’t named. 43 | 44 | * `repr(C)` is equivalent to one of `repr(u*)` (see the next section) for 45 | fieldless enums. The chosen size and sign is the default enum size and sign for the target platform's C 46 | application binary interface (ABI). Note that enum representation in C is implementation 47 | defined, so this is really a "best guess". In particular, this may be incorrect 48 | when the C code of interest is compiled with certain flags. 49 | 50 | * Fieldless enums with `repr(C)` or `repr(u*)` still may not be set to an 51 | integer value without a corresponding variant, even though this is 52 | permitted behavior in C or C++. It is undefined behavior to (unsafely) 53 | construct an instance of an enum that does not match one of its 54 | variants. (This allows exhaustive matches to continue to be written and 55 | compiled as normal.) 56 | 57 | ## repr(transparent) 58 | 59 | `#[repr(transparent)]` can only be used on a struct or single-variant enum that has a single non-zero-sized field (there may be additional zero-sized fields). 60 | The effect is that the layout and ABI of the whole struct/enum is guaranteed to be the same as that one field. 61 | 62 | > NOTE: There's a `transparent_unions` nightly feature to apply `repr(transparent)` to unions, 63 | > but it hasn't been stabilized due to design concerns. See the [tracking issue][issue-60405] for more details. 64 | 65 | The goal is to make it possible to transmute between the single field and the 66 | struct/enum. An example of that is [`UnsafeCell`], which can be transmuted into 67 | the type it wraps ([`UnsafeCell`] also uses the unstable [no_niche][no-niche-pull], 68 | so its ABI is not actually guaranteed to be the same when nested in other types). 69 | 70 | Also, passing the struct/enum through FFI where the inner field type is expected on 71 | the other side is guaranteed to work. In particular, this is necessary for 72 | `struct Foo(f32)` or `enum Foo { Bar(f32) }` to always have the same ABI as `f32`. 73 | 74 | This repr is only considered part of the public ABI of a type if either the single 75 | field is `pub`, or if its layout is documented in prose. Otherwise, the layout should 76 | not be relied upon by other crates. 77 | 78 | More details are in the [RFC 1758][rfc-transparent] and the [RFC 2645][rfc-transparent-unions-enums]. 79 | 80 | ## repr(u*), repr(i*) 81 | 82 | These specify the size and sign to make a fieldless enum. If the discriminant overflows 83 | the integer it has to fit in, it will produce a compile-time error. You can 84 | manually ask Rust to allow this by setting the overflowing element to explicitly 85 | be 0. However Rust will not allow you to create an enum where two variants have 86 | the same discriminant. 87 | 88 | The term "fieldless enum" only means that the enum doesn't have data in any 89 | of its variants. A fieldless enum without a `repr` is 90 | still a Rust native type, and does not have a stable layout or representation. 
91 | Adding a `repr(u*)`/`repr(i*)` causes it to be treated exactly like the specified 92 | integer type for layout purposes (except that the compiler will still exploit its 93 | knowledge of "invalid" values at this type to optimize enum layout, such as when 94 | this enum is wrapped in `Option`). Note that the function call ABI for these 95 | types is still in general unspecified, except that across `extern "C"` calls they 96 | are ABI-compatible with C enums of the same sign and size. 97 | 98 | If the enum has fields, the effect is similar to the effect of `repr(C)` 99 | in that there is a defined layout of the type. This makes it possible to 100 | pass the enum to C code, or access the type's raw representation and directly 101 | manipulate its tag and fields. See [the RFC][really-tagged] for details. 102 | 103 | These `repr`s have no effect on a struct. 104 | 105 | Adding an explicit `repr(u*)`, `repr(i*)`, or `repr(C)` to an enum with fields suppresses the null-pointer optimization, like: 106 | 107 | ```rust 108 | # use std::mem::size_of; 109 | enum MyOption { 110 | Some(T), 111 | None, 112 | } 113 | 114 | #[repr(u8)] 115 | enum MyReprOption { 116 | Some(T), 117 | None, 118 | } 119 | 120 | assert_eq!(8, size_of::>()); 121 | assert_eq!(16, size_of::>()); 122 | ``` 123 | 124 | This optimization still applies to fieldless enums with an explicit `repr(u*)`, `repr(i*)`, or `repr(C)`. 125 | 126 | ## repr(packed), repr(packed(n)) 127 | 128 | `repr(packed(n))` (where `n` is a power of two) forces the type to have an 129 | alignment of *at most* `n`. Most commonly used without an explicit `n`, 130 | `repr(packed)` is equivalent to `repr(packed(1))` which forces Rust to strip 131 | any padding, and only align the type to a byte. This may improve the memory 132 | footprint, but will likely have other negative side-effects. 133 | 134 | In particular, most architectures *strongly* prefer values to be naturally 135 | aligned. This may mean that unaligned loads are penalized (x86), or even fault 136 | (some ARM chips). For simple cases like directly loading or storing a packed 137 | field, the compiler might be able to paper over alignment issues with shifts 138 | and masks. However if you take a reference to a packed field, it's unlikely 139 | that the compiler will be able to emit code to avoid an unaligned load. 140 | 141 | [As this can cause undefined behavior][ub loads], the lint has been implemented 142 | and it will become a hard error. 143 | 144 | `repr(packed)/repr(packed(n))` is not to be used lightly. Unless you have 145 | extreme requirements, this should not be used. 146 | 147 | This repr is a modifier on `repr(C)` and `repr(Rust)`. For FFI compatibility 148 | you most likely always want to be explicit: `repr(C, packed)`. 149 | 150 | ## repr(align(n)) 151 | 152 | `repr(align(n))` (where `n` is a power of two) forces the type to have an 153 | alignment of *at least* `n`. 154 | 155 | This enables several tricks, like making sure neighboring elements of an array 156 | never share the same cache line with each other (which may speed up certain 157 | kinds of concurrent code). 158 | 159 | This is a modifier on `repr(C)` and `repr(Rust)`. It is incompatible with 160 | `repr(packed)`. 
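To make the last two of these concrete, here's a small sketch; the `64` assumes a typical cache-line size, which is a property of the hardware rather than of Rust:

```rust
use std::mem::{align_of, size_of};

// Aligned (and therefore padded) to 64 bytes, so neighboring values in an
// array never share a 64-byte cache line.
#[repr(C, align(64))]
struct Padded {
    value: u64,
}

// All padding stripped: alignment 1, fields packed back-to-back.
#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u32,
}

assert_eq!(align_of::<Padded>(), 64);
assert_eq!(size_of::<Padded>(), 64);
assert_eq!(align_of::<Packed>(), 1);
assert_eq!(size_of::<Packed>(), 5);
```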
161 | 162 | [drop flags]: drop-flags.html 163 | [ub loads]: https://github.com/rust-lang/rust/issues/27060 164 | [issue-60405]: https://github.com/rust-lang/rust/issues/60405 165 | [`UnsafeCell`]: ../std/cell/struct.UnsafeCell.html 166 | [rfc-transparent]: https://github.com/rust-lang/rfcs/blob/master/text/1758-repr-transparent.md 167 | [rfc-transparent-unions-enums]: https://rust-lang.github.io/rfcs/2645-transparent-unions.html 168 | [really-tagged]: https://github.com/rust-lang/rfcs/blob/master/text/2195-really-tagged-unions.md 169 | [rust-bindgen]: https://rust-lang.github.io/rust-bindgen/ 170 | [cbindgen]: https://github.com/eqrion/cbindgen 171 | [no-niche-pull]: https://github.com/rust-lang/rust/pull/68491 172 | -------------------------------------------------------------------------------- /src/borrow-splitting.md: -------------------------------------------------------------------------------- 1 | # Splitting Borrows 2 | 3 | The mutual exclusion property of mutable references can be very limiting when 4 | working with a composite structure. The borrow checker (a.k.a. borrowck) 5 | understands some basic stuff, but will fall over pretty easily. It does 6 | understand structs sufficiently to know that it's possible to borrow disjoint 7 | fields of a struct simultaneously. So this works today: 8 | 9 | ```rust 10 | struct Foo { 11 | a: i32, 12 | b: i32, 13 | c: i32, 14 | } 15 | 16 | let mut x = Foo {a: 0, b: 0, c: 0}; 17 | let a = &mut x.a; 18 | let b = &mut x.b; 19 | let c = &x.c; 20 | *b += 1; 21 | let c2 = &x.c; 22 | *a += 10; 23 | println!("{} {} {} {}", a, b, c, c2); 24 | ``` 25 | 26 | However borrowck doesn't understand arrays or slices in any way, so this doesn't 27 | work: 28 | 29 | ```rust,compile_fail 30 | let mut x = [1, 2, 3]; 31 | let a = &mut x[0]; 32 | let b = &mut x[1]; 33 | println!("{} {}", a, b); 34 | ``` 35 | 36 | ```text 37 | error[E0499]: cannot borrow `x[..]` as mutable more than once at a time 38 | --> src/lib.rs:4:18 39 | | 40 | 3 | let a = &mut x[0]; 41 | | ---- first mutable borrow occurs here 42 | 4 | let b = &mut x[1]; 43 | | ^^^^ second mutable borrow occurs here 44 | 5 | println!("{} {}", a, b); 45 | 6 | } 46 | | - first borrow ends here 47 | 48 | error: aborting due to previous error 49 | ``` 50 | 51 | While it was plausible that borrowck could understand this simple case, it's 52 | pretty clearly hopeless for borrowck to understand disjointness in general 53 | container types like a tree, especially if distinct keys actually *do* map 54 | to the same value. 55 | 56 | In order to "teach" borrowck that what we're doing is ok, we need to drop down 57 | to unsafe code. For instance, mutable slices expose a `split_at_mut` function 58 | that consumes the slice and returns two mutable slices. One for everything to 59 | the left of the index, and one for everything to the right. Intuitively we know 60 | this is safe because the slices don't overlap, and therefore alias. 
However
61 | the implementation requires some unsafety:
62 | 
63 | ```rust
64 | # use std::slice::from_raw_parts_mut;
65 | # struct FakeSlice<T>(T);
66 | # impl<T> FakeSlice<T> {
67 | #     fn len(&self) -> usize { unimplemented!() }
68 | #     fn as_mut_ptr(&mut self) -> *mut T { unimplemented!() }
69 | pub fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) {
70 |     let len = self.len();
71 |     let ptr = self.as_mut_ptr();
72 | 
73 |     unsafe {
74 |         assert!(mid <= len);
75 | 
76 |         (from_raw_parts_mut(ptr, mid),
77 |          from_raw_parts_mut(ptr.add(mid), len - mid))
78 |     }
79 | }
80 | # }
81 | ```
82 | 
83 | This is actually a bit subtle. So as to avoid ever making two `&mut`'s to the
84 | same value, we explicitly construct brand-new slices through raw pointers.
85 | 
86 | However more subtle is how iterators that yield mutable references work.
87 | The iterator trait is defined as follows:
88 | 
89 | ```rust
90 | trait Iterator {
91 |     type Item;
92 | 
93 |     fn next(&mut self) -> Option<Self::Item>;
94 | }
95 | ```
96 | 
97 | Given this definition, Self::Item has *no* connection to `self`. This means that
98 | we can call `next` several times in a row, and hold onto all the results
99 | *concurrently*. This is perfectly fine for by-value iterators, which have
100 | exactly these semantics. It's also actually fine for shared references, as they
101 | admit arbitrarily many references to the same thing (although the iterator needs
102 | to be a separate object from the thing being shared).
103 | 
104 | But mutable references make this a mess. At first glance, they might seem
105 | completely incompatible with this API, as it would produce multiple mutable
106 | references to the same object!
107 | 
108 | However it actually *does* work, exactly because iterators are one-shot objects.
109 | Everything an IterMut yields will be yielded at most once, so we don't
110 | actually ever yield multiple mutable references to the same piece of data.
111 | 
112 | Perhaps surprisingly, mutable iterators don't require unsafe code to be
113 | implemented for many types!
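Before looking at implementations, here's the payoff from the user's side: perfectly safe code gets to hold several `&mut`s into the same container at once, as long as each one came out of a different call to `next`:

```rust
let mut data = [1, 2, 3];
let mut iter = data.iter_mut();

// Each call to `next` hands back a `&mut i32` to a *different* element,
// so it's fine to keep both around and use them together.
let a = iter.next().unwrap();
let b = iter.next().unwrap();
*a += 10;
*b += 20;
assert_eq!(data, [11, 22, 3]);
```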
114 | 115 | For instance here's a singly linked list: 116 | 117 | ```rust 118 | # fn main() {} 119 | type Link = Option>>; 120 | 121 | struct Node { 122 | elem: T, 123 | next: Link, 124 | } 125 | 126 | pub struct LinkedList { 127 | head: Link, 128 | } 129 | 130 | pub struct IterMut<'a, T: 'a>(Option<&'a mut Node>); 131 | 132 | impl LinkedList { 133 | fn iter_mut(&mut self) -> IterMut { 134 | IterMut(self.head.as_mut().map(|node| &mut **node)) 135 | } 136 | } 137 | 138 | impl<'a, T> Iterator for IterMut<'a, T> { 139 | type Item = &'a mut T; 140 | 141 | fn next(&mut self) -> Option { 142 | self.0.take().map(|node| { 143 | self.0 = node.next.as_mut().map(|node| &mut **node); 144 | &mut node.elem 145 | }) 146 | } 147 | } 148 | ``` 149 | 150 | Here's a mutable slice: 151 | 152 | ```rust 153 | # fn main() {} 154 | use std::mem; 155 | 156 | pub struct IterMut<'a, T: 'a>(&'a mut[T]); 157 | 158 | impl<'a, T> Iterator for IterMut<'a, T> { 159 | type Item = &'a mut T; 160 | 161 | fn next(&mut self) -> Option { 162 | let slice = mem::take(&mut self.0); 163 | if slice.is_empty() { return None; } 164 | 165 | let (l, r) = slice.split_at_mut(1); 166 | self.0 = r; 167 | l.get_mut(0) 168 | } 169 | } 170 | 171 | impl<'a, T> DoubleEndedIterator for IterMut<'a, T> { 172 | fn next_back(&mut self) -> Option { 173 | let slice = mem::take(&mut self.0); 174 | if slice.is_empty() { return None; } 175 | 176 | let new_len = slice.len() - 1; 177 | let (l, r) = slice.split_at_mut(new_len); 178 | self.0 = l; 179 | r.get_mut(0) 180 | } 181 | } 182 | ``` 183 | 184 | And here's a binary tree: 185 | 186 | ```rust 187 | # fn main() {} 188 | use std::collections::VecDeque; 189 | 190 | type Link = Option>>; 191 | 192 | struct Node { 193 | elem: T, 194 | left: Link, 195 | right: Link, 196 | } 197 | 198 | pub struct Tree { 199 | root: Link, 200 | } 201 | 202 | struct NodeIterMut<'a, T: 'a> { 203 | elem: Option<&'a mut T>, 204 | left: Option<&'a mut Node>, 205 | right: Option<&'a mut Node>, 206 | } 207 | 208 | enum State<'a, T: 'a> { 209 | Elem(&'a mut T), 210 | Node(&'a mut Node), 211 | } 212 | 213 | pub struct IterMut<'a, T: 'a>(VecDeque>); 214 | 215 | impl Tree { 216 | pub fn iter_mut(&mut self) -> IterMut { 217 | let mut deque = VecDeque::new(); 218 | if let Some(root) = self.root.as_mut() { 219 | deque.push_front(root.iter_mut()); 220 | } 221 | IterMut(deque) 222 | } 223 | } 224 | 225 | impl Node { 226 | pub fn iter_mut(&mut self) -> NodeIterMut { 227 | NodeIterMut { 228 | elem: Some(&mut self.elem), 229 | left: self.left.as_deref_mut(), 230 | right: self.right.as_deref_mut(), 231 | } 232 | } 233 | } 234 | 235 | impl<'a, T> Iterator for NodeIterMut<'a, T> { 236 | type Item = State<'a, T>; 237 | 238 | fn next(&mut self) -> Option { 239 | self.left.take().map(State::Node).or_else(|| { 240 | self.elem 241 | .take() 242 | .map(State::Elem) 243 | .or_else(|| self.right.take().map(State::Node)) 244 | }) 245 | } 246 | } 247 | 248 | impl<'a, T> DoubleEndedIterator for NodeIterMut<'a, T> { 249 | fn next_back(&mut self) -> Option { 250 | self.right.take().map(State::Node).or_else(|| { 251 | self.elem 252 | .take() 253 | .map(State::Elem) 254 | .or_else(|| self.left.take().map(State::Node)) 255 | }) 256 | } 257 | } 258 | 259 | impl<'a, T> Iterator for IterMut<'a, T> { 260 | type Item = &'a mut T; 261 | fn next(&mut self) -> Option { 262 | loop { 263 | match self.0.front_mut().and_then(Iterator::next) { 264 | Some(State::Elem(elem)) => return Some(elem), 265 | Some(State::Node(node)) => self.0.push_front(node.iter_mut()), 266 | None => { 
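// The front node iterator is exhausted (or the deque is empty); pop it,
// bail out with `None` if there was nothing to pop, and otherwise loop
// around to try the next node.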
267 | self.0.pop_front()?; 268 | } 269 | } 270 | } 271 | } 272 | } 273 | 274 | impl<'a, T> DoubleEndedIterator for IterMut<'a, T> { 275 | fn next_back(&mut self) -> Option { 276 | loop { 277 | match self.0.back_mut().and_then(DoubleEndedIterator::next_back) { 278 | Some(State::Elem(elem)) => return Some(elem), 279 | Some(State::Node(node)) => self.0.push_back(node.iter_mut()), 280 | None => { 281 | self.0.pop_back()?; 282 | } 283 | } 284 | } 285 | } 286 | } 287 | ``` 288 | 289 | All of these are completely safe and work on stable Rust! This ultimately 290 | falls out of the simple struct case we saw before: Rust understands that you 291 | can safely split a mutable reference into subfields. We can then encode 292 | permanently consuming a reference via Options (or in the case of slices, 293 | replacing with an empty slice). 294 | -------------------------------------------------------------------------------- /src/safe-unsafe-meaning.md: -------------------------------------------------------------------------------- 1 | # How Safe and Unsafe Interact 2 | 3 | What's the relationship between Safe Rust and Unsafe Rust? How do they 4 | interact? 5 | 6 | The separation between Safe Rust and Unsafe Rust is controlled with the 7 | `unsafe` keyword, which acts as an interface from one to the other. This is 8 | why we can say Safe Rust is a safe language: all the unsafe parts are kept 9 | exclusively behind the `unsafe` boundary. If you wish, you can even toss 10 | `#![forbid(unsafe_code)]` into your code base to statically guarantee that 11 | you're only writing Safe Rust. 12 | 13 | The `unsafe` keyword has two uses: to declare the existence of contracts the 14 | compiler can't check, and to declare that a programmer has checked that these 15 | contracts have been upheld. 16 | 17 | You can use `unsafe` to indicate the existence of unchecked contracts on 18 | _functions_ and _trait declarations_. On functions, `unsafe` means that 19 | users of the function must check that function's documentation to ensure 20 | they are using it in a way that maintains the contracts the function 21 | requires. On trait declarations, `unsafe` means that implementors of the 22 | trait must check the trait documentation to ensure their implementation 23 | maintains the contracts the trait requires. 24 | 25 | You can use `unsafe` on a block to declare that all unsafe actions performed 26 | within are verified to uphold the contracts of those operations. For instance, 27 | the index passed to [`slice::get_unchecked`][get_unchecked] is in-bounds. 28 | 29 | You can use `unsafe` on a trait implementation to declare that the implementation 30 | upholds the trait's contract. For instance, that a type implementing [`Send`] is 31 | really safe to move to another thread. 32 | 33 | The standard library has a number of unsafe functions, including: 34 | 35 | * [`slice::get_unchecked`][get_unchecked], which performs unchecked indexing, 36 | allowing memory safety to be freely violated. 37 | * [`mem::transmute`][transmute] reinterprets some value as having a given type, 38 | bypassing type safety in arbitrary ways (see [conversions] for details). 39 | * Every raw pointer to a sized type has an [`offset`][ptr_offset] method that 40 | invokes Undefined Behavior if the passed offset is not ["in bounds"][ptr_offset]. 41 | * All FFI (Foreign Function Interface) functions are `unsafe` to call because the 42 | other language can do arbitrary operations that the Rust compiler can't check. 
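Calling any of these requires an `unsafe` block, which is where the programmer asserts that the documented contract has actually been checked. A minimal sketch:

```rust
fn third(xs: &[u8]) -> u8 {
    assert!(xs.len() > 2);
    // SAFETY: the assert above guarantees that index 2 is in-bounds.
    unsafe { *xs.get_unchecked(2) }
}

assert_eq!(third(&[10, 20, 30, 40]), 30);
```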
43 | 44 | As of Rust 1.29.2 the standard library defines the following unsafe traits 45 | (there are others, but they are not stabilized yet and some of them may never 46 | be): 47 | 48 | * [`Send`] is a marker trait (a trait with no API) that promises implementors 49 | are safe to send (move) to another thread. 50 | * [`Sync`] is a marker trait that promises threads can safely share implementors 51 | through a shared reference. 52 | * [`GlobalAlloc`] allows customizing the memory allocator of the whole program. 53 | 54 | Much of the Rust standard library also uses Unsafe Rust internally. These 55 | implementations have generally been rigorously manually checked, so the Safe Rust 56 | interfaces built on top of these implementations can be assumed to be safe. 57 | 58 | The need for all of this separation boils down a single fundamental property 59 | of Safe Rust, the *soundness property*: 60 | 61 | **No matter what, Safe Rust can't cause Undefined Behavior.** 62 | 63 | The design of the safe/unsafe split means that there is an asymmetric trust 64 | relationship between Safe and Unsafe Rust. Safe Rust inherently has to 65 | trust that any Unsafe Rust it touches has been written correctly. 66 | On the other hand, Unsafe Rust cannot trust Safe Rust without care. 67 | 68 | As an example, Rust has the [`PartialOrd`] and [`Ord`] traits to differentiate 69 | between types which can "just" be compared, and those that provide a "total" 70 | ordering (which basically means that comparison behaves reasonably). 71 | 72 | [`BTreeMap`] doesn't really make sense for partially-ordered types, and so it 73 | requires that its keys implement `Ord`. However, `BTreeMap` has Unsafe Rust code 74 | inside of its implementation. Because it would be unacceptable for a sloppy `Ord` 75 | implementation (which is Safe to write) to cause Undefined Behavior, the Unsafe 76 | code in BTreeMap must be written to be robust against `Ord` implementations which 77 | aren't actually total — even though that's the whole point of requiring `Ord`. 78 | 79 | The Unsafe Rust code just can't trust the Safe Rust code to be written correctly. 80 | That said, `BTreeMap` will still behave completely erratically if you feed in 81 | values that don't have a total ordering. It just won't ever cause Undefined 82 | Behavior. 83 | 84 | One may wonder, if `BTreeMap` cannot trust `Ord` because it's Safe, why can it 85 | trust *any* Safe code? For instance `BTreeMap` relies on integers and slices to 86 | be implemented correctly. Those are safe too, right? 87 | 88 | The difference is one of scope. When `BTreeMap` relies on integers and slices, 89 | it's relying on one very specific implementation. This is a measured risk that 90 | can be weighed against the benefit. In this case there's basically zero risk; 91 | if integers and slices are broken, *everyone* is broken. Also, they're maintained 92 | by the same people who maintain `BTreeMap`, so it's easy to keep tabs on them. 93 | 94 | On the other hand, `BTreeMap`'s key type is generic. Trusting its `Ord` implementation 95 | means trusting every `Ord` implementation in the past, present, and future. 96 | Here the risk is high: someone somewhere is going to make a mistake and mess up 97 | their `Ord` implementation, or even just straight up lie about providing a total 98 | ordering because "it seems to work". When that happens, `BTreeMap` needs to be 99 | prepared. 100 | 101 | The same logic applies to trusting a closure that's passed to you to behave 102 | correctly. 
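To make this concrete, here's a contrived but perfectly Safe `Ord` implementation that flatly violates the (unenforced) total-ordering contract, and which `BTreeMap` must nonetheless survive without Undefined Behavior:

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

#[derive(PartialEq, Eq)]
struct Evil;

impl PartialOrd for Evil {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for Evil {
    // Claims every `Evil` is less than every other `Evil`, including itself.
    // Nonsense, but entirely Safe to write.
    fn cmp(&self, _other: &Self) -> Ordering {
        Ordering::Less
    }
}

// The map may behave erratically (lookups that fail, "duplicate" keys, ...),
// but it must never be memory-unsafe.
let mut map = BTreeMap::new();
map.insert(Evil, 1);
map.insert(Evil, 2);
```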
103 | 104 | This problem of unbounded generic trust is the problem that `unsafe` traits 105 | exist to resolve. The `BTreeMap` type could theoretically require that keys 106 | implement a new trait called `UnsafeOrd`, rather than `Ord`, that might look 107 | like this: 108 | 109 | ```rust 110 | use std::cmp::Ordering; 111 | 112 | unsafe trait UnsafeOrd { 113 | fn cmp(&self, other: &Self) -> Ordering; 114 | } 115 | ``` 116 | 117 | Then, a type would use `unsafe` to implement `UnsafeOrd`, indicating that 118 | they've ensured their implementation maintains whatever contracts the 119 | trait expects. In this situation, the Unsafe Rust in the internals of 120 | `BTreeMap` would be justified in trusting that the key type's `UnsafeOrd` 121 | implementation is correct. If it isn't, it's the fault of the unsafe trait 122 | implementation, which is consistent with Rust's safety guarantees. 123 | 124 | The decision of whether to mark a trait `unsafe` is an API design choice. A 125 | safe trait is easier to implement, but any unsafe code that relies on it must 126 | defend against incorrect behavior. Marking a trait `unsafe` shifts this 127 | responsibility to the implementor. Rust has traditionally avoided marking 128 | traits `unsafe` because it makes Unsafe Rust pervasive, which isn't desirable. 129 | 130 | `Send` and `Sync` are marked unsafe because thread safety is a *fundamental 131 | property* that unsafe code can't possibly hope to defend against in the way it 132 | could defend against a buggy `Ord` implementation. Similarly, `GlobalAllocator` 133 | is keeping accounts of all the memory in the program and other things like 134 | `Box` or `Vec` build on top of it. If it does something weird (giving the same 135 | chunk of memory to another request when it is still in use), there's no chance 136 | to detect that and do anything about it. 137 | 138 | The decision of whether to mark your own traits `unsafe` depends on the same 139 | sort of consideration. If `unsafe` code can't reasonably expect to defend 140 | against a broken implementation of the trait, then marking the trait `unsafe` is 141 | a reasonable choice. 142 | 143 | As an aside, while `Send` and `Sync` are `unsafe` traits, they are *also* 144 | automatically implemented for types when such derivations are provably safe 145 | to do. `Send` is automatically derived for all types composed only of values 146 | whose types also implement `Send`. `Sync` is automatically derived for all 147 | types composed only of values whose types also implement `Sync`. This minimizes 148 | the pervasive unsafety of making these two traits `unsafe`. And not many people 149 | are going to *implement* memory allocators (or use them directly, for that 150 | matter). 151 | 152 | This is the balance between Safe and Unsafe Rust. The separation is designed to 153 | make using Safe Rust as ergonomic as possible, but requires extra effort and 154 | care when writing Unsafe Rust. The rest of this book is largely a discussion 155 | of the sort of care that must be taken, and what contracts Unsafe Rust must uphold. 
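To see what shouldering that responsibility looks like in code, here's a sketch for a hypothetical owning-pointer type (the standard library does essentially this for the pointer type inside `Box` and `Vec`):

```rust
use std::ptr::NonNull;

// Pretend this uniquely owns the allocation behind the pointer,
// the way the pointer inside a `Box<T>` does.
struct MyBox<T>(NonNull<T>);

// `NonNull<T>` (like raw pointers) is neither `Send` nor `Sync`, so the
// compiler won't implement them for `MyBox` automatically. By writing
// `unsafe impl`, *we* promise that the thread-safety contract holds
// whenever it holds for `T`.
unsafe impl<T: Send> Send for MyBox<T> {}
unsafe impl<T: Sync> Sync for MyBox<T> {}
```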
156 | 157 | [`Send`]: ../std/marker/trait.Send.html 158 | [`Sync`]: ../std/marker/trait.Sync.html 159 | [`GlobalAlloc`]: ../std/alloc/trait.GlobalAlloc.html 160 | [conversions]: conversions.html 161 | [ptr_offset]: ../std/primitive.pointer.html#method.offset 162 | [get_unchecked]: ../std/primitive.slice.html#method.get_unchecked 163 | [transmute]: ../std/mem/fn.transmute.html 164 | [`PartialOrd`]: ../std/cmp/trait.PartialOrd.html 165 | [`Ord`]: ../std/cmp/trait.Ord.html 166 | [`BTreeMap`]: ../std/collections/struct.BTreeMap.html 167 | -------------------------------------------------------------------------------- /src/vec/vec-zsts.md: -------------------------------------------------------------------------------- 1 | # Handling Zero-Sized Types 2 | 3 | It's time. We're going to fight the specter that is zero-sized types. Safe Rust 4 | *never* needs to care about this, but Vec is very intensive on raw pointers and 5 | raw allocations, which are exactly the two things that care about 6 | zero-sized types. We need to be careful of two things: 7 | 8 | * The raw allocator API has undefined behavior if you pass in 0 for an 9 | allocation size. 10 | * raw pointer offsets are no-ops for zero-sized types, which will break our 11 | C-style pointer iterator. 12 | 13 | Thankfully we abstracted out pointer-iterators and allocating handling into 14 | `RawValIter` and `RawVec` respectively. How mysteriously convenient. 15 | 16 | ## Allocating Zero-Sized Types 17 | 18 | So if the allocator API doesn't support zero-sized allocations, what on earth 19 | do we store as our allocation? `NonNull::dangling()` of course! Almost every operation 20 | with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs 21 | to be considered to store or load them. This actually extends to `ptr::read` and 22 | `ptr::write`: they won't actually look at the pointer at all. As such we never need 23 | to change the pointer. 24 | 25 | Note however that our previous reliance on running out of memory before overflow is 26 | no longer valid with zero-sized types. We must explicitly guard against capacity 27 | overflow for zero-sized types. 28 | 29 | Due to our current architecture, all this means is writing 3 guards, one in each 30 | method of `RawVec`. 31 | 32 | 33 | ```rust,ignore 34 | impl RawVec { 35 | fn new() -> Self { 36 | // This branch should be stripped at compile time. 37 | let cap = if mem::size_of::() == 0 { usize::MAX } else { 0 }; 38 | 39 | // `NonNull::dangling()` doubles as "unallocated" and "zero-sized allocation" 40 | RawVec { 41 | ptr: NonNull::dangling(), 42 | cap, 43 | } 44 | } 45 | 46 | fn grow(&mut self) { 47 | // since we set the capacity to usize::MAX when T has size 0, 48 | // getting to here necessarily means the Vec is overfull. 49 | assert!(mem::size_of::() != 0, "capacity overflow"); 50 | 51 | let (new_cap, new_layout) = if self.cap == 0 { 52 | (1, Layout::array::(1).unwrap()) 53 | } else { 54 | // This can't overflow because we ensure self.cap <= isize::MAX. 55 | let new_cap = 2 * self.cap; 56 | 57 | // `Layout::array` checks that the number of bytes is <= usize::MAX, 58 | // but this is redundant since old_layout.size() <= isize::MAX, 59 | // so the `unwrap` should never fail. 60 | let new_layout = Layout::array::(new_cap).unwrap(); 61 | (new_cap, new_layout) 62 | }; 63 | 64 | // Ensure that the new allocation doesn't exceed `isize::MAX` bytes. 
65 | assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large"); 66 | 67 | let new_ptr = if self.cap == 0 { 68 | unsafe { alloc::alloc(new_layout) } 69 | } else { 70 | let old_layout = Layout::array::(self.cap).unwrap(); 71 | let old_ptr = self.ptr.as_ptr() as *mut u8; 72 | unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) } 73 | }; 74 | 75 | // If allocation fails, `new_ptr` will be null, in which case we abort. 76 | self.ptr = match NonNull::new(new_ptr as *mut T) { 77 | Some(p) => p, 78 | None => alloc::handle_alloc_error(new_layout), 79 | }; 80 | self.cap = new_cap; 81 | } 82 | } 83 | 84 | impl Drop for RawVec { 85 | fn drop(&mut self) { 86 | let elem_size = mem::size_of::(); 87 | 88 | if self.cap != 0 && elem_size != 0 { 89 | unsafe { 90 | alloc::dealloc( 91 | self.ptr.as_ptr() as *mut u8, 92 | Layout::array::(self.cap).unwrap(), 93 | ); 94 | } 95 | } 96 | } 97 | } 98 | ``` 99 | 100 | That's it. We support pushing and popping zero-sized types now. Our iterators 101 | (that aren't provided by slice Deref) are still busted, though. 102 | 103 | ## Iterating Zero-Sized Types 104 | 105 | Zero-sized offsets are no-ops. This means that our current design will always 106 | initialize `start` and `end` as the same value, and our iterators will yield 107 | nothing. The current solution to this is to cast the pointers to integers, 108 | increment, and then cast them back: 109 | 110 | 111 | ```rust,ignore 112 | impl RawValIter { 113 | unsafe fn new(slice: &[T]) -> Self { 114 | RawValIter { 115 | start: slice.as_ptr(), 116 | end: if mem::size_of::() == 0 { 117 | ((slice.as_ptr() as usize) + slice.len()) as *const _ 118 | } else if slice.len() == 0 { 119 | slice.as_ptr() 120 | } else { 121 | slice.as_ptr().add(slice.len()) 122 | }, 123 | } 124 | } 125 | } 126 | ``` 127 | 128 | Now we have a different bug. Instead of our iterators not running at all, our 129 | iterators now run *forever*. We need to do the same trick in our iterator impls. 130 | Also, our size_hint computation code will divide by 0 for ZSTs. Since we'll 131 | basically be treating the two pointers as if they point to bytes, we'll just 132 | map size 0 to divide by 1. Here's what `next` will be: 133 | 134 | 135 | ```rust,ignore 136 | fn next(&mut self) -> Option { 137 | if self.start == self.end { 138 | None 139 | } else { 140 | unsafe { 141 | let result = ptr::read(self.start); 142 | self.start = if mem::size_of::() == 0 { 143 | (self.start as usize + 1) as *const _ 144 | } else { 145 | self.start.offset(1) 146 | }; 147 | Some(result) 148 | } 149 | } 150 | } 151 | ``` 152 | 153 | Do you see the "bug"? No one else did! The original author only noticed the 154 | problem when linking to this page years later. This code is kind of dubious 155 | because abusing the iterator pointers to be *counters* makes them unaligned! 156 | Our *one job* when using ZSTs is to keep pointers aligned! *forehead slap* 157 | 158 | Raw pointers don't need to be aligned at all times, so the basic trick of 159 | using pointers as counters is *fine*, but they *should* definitely be aligned 160 | when passed to `ptr::read`! This is *possibly* needless pedantry 161 | because `ptr::read` is a noop for a ZST, but let's be a *little* more 162 | responsible and read from `NonNull::dangling` on the ZST path. 163 | 164 | (Alternatively you could call `read_unaligned` on the ZST path. Either is fine, 165 | because either way we're making up a value from nothing and it all compiles 166 | to doing nothing.) 
167 | 
168 | 
169 | ```rust,ignore
170 | impl<T> Iterator for RawValIter<T> {
171 |     type Item = T;
172 |     fn next(&mut self) -> Option<T> {
173 |         if self.start == self.end {
174 |             None
175 |         } else {
176 |             unsafe {
177 |                 if mem::size_of::<T>() == 0 {
178 |                     self.start = (self.start as usize + 1) as *const _;
179 |                     Some(ptr::read(NonNull::<T>::dangling().as_ptr()))
180 |                 } else {
181 |                     let old_ptr = self.start;
182 |                     self.start = self.start.offset(1);
183 |                     Some(ptr::read(old_ptr))
184 |                 }
185 |             }
186 |         }
187 |     }
188 | 
189 |     fn size_hint(&self) -> (usize, Option<usize>) {
190 |         let elem_size = mem::size_of::<T>();
191 |         let len = (self.end as usize - self.start as usize)
192 |                   / if elem_size == 0 { 1 } else { elem_size };
193 |         (len, Some(len))
194 |     }
195 | }
196 | 
197 | impl<T> DoubleEndedIterator for RawValIter<T> {
198 |     fn next_back(&mut self) -> Option<T> {
199 |         if self.start == self.end {
200 |             None
201 |         } else {
202 |             unsafe {
203 |                 if mem::size_of::<T>() == 0 {
204 |                     self.end = (self.end as usize - 1) as *const _;
205 |                     Some(ptr::read(NonNull::<T>::dangling().as_ptr()))
206 |                 } else {
207 |                     self.end = self.end.offset(-1);
208 |                     Some(ptr::read(self.end))
209 |                 }
210 |             }
211 |         }
212 |     }
213 | }
214 | ```
215 | 
216 | And that's it. Iteration works!
217 | 
218 | One last thing we need to consider is that when our vector is dropped, it deallocates the memory that was allocated while it was alive. With ZSTs, we didn't allocate any memory; in fact, we never do. So, right now, our code is unsound: it still tries to deallocate the `NonNull::dangling()` pointer that we use to stand in for the ZST allocation. Deallocating something we never allocated is undefined behavior (obviously, and for good reasons). To fix this, we're going to tweak our `RawVec`'s `Drop` implementation so that it only deallocates when the element type actually has a nonzero size.
219 | 
220 | ```rust,ignore
221 | impl<T> Drop for RawVec<T> {
222 |     fn drop(&mut self) {
223 |         println!("RawVec Drop called, deallocating memory");
224 |         if self.cap != 0 && std::mem::size_of::<T>() > 0 {
225 |             let layout = std::alloc::Layout::array::<T>(self.cap).unwrap();
226 |             unsafe {
227 |                 std::alloc::dealloc(self.ptr.as_ptr() as *mut _, layout);
228 |             }
229 |         }
230 |     }
231 | }
232 | ```
233 | 
234 | 
--------------------------------------------------------------------------------
/src/vec/vec-alloc.md:
--------------------------------------------------------------------------------
1 | # Allocating Memory
2 | 
3 | Using `NonNull<T>` throws a wrench in an important feature of Vec (and indeed all of
4 | the std collections): creating an empty Vec doesn't actually allocate at all. This
5 | is not the same as allocating a zero-sized memory block, which is not allowed by
6 | the global allocator (it results in undefined behavior!). So if we can't allocate,
7 | but also can't put a null pointer in `ptr`, what do we do in `Vec::new`? Well, we
8 | just put some other garbage in there!
9 | 
10 | This is perfectly fine because we already have `cap == 0` as our sentinel for no
11 | allocation. We don't even need to handle it specially in almost any code because
12 | we usually need to check if `cap > len` or `len > 0` anyway. The recommended
13 | Rust value to put here is `mem::align_of::<T>()`. `NonNull` provides a convenience
14 | for this: `NonNull::dangling()`. There are quite a few places where we'll
15 | want to use `dangling` because there's no real allocation to talk about but
16 | `null` would make the compiler do bad things.
17 | 18 | So: 19 | 20 | 21 | ```rust,ignore 22 | use std::mem; 23 | 24 | impl Vec { 25 | pub fn new() -> Self { 26 | assert!(mem::size_of::() != 0, "We're not ready to handle ZSTs"); 27 | Vec { 28 | ptr: NonNull::dangling(), 29 | len: 0, 30 | cap: 0, 31 | } 32 | } 33 | } 34 | # fn main() {} 35 | ``` 36 | 37 | I slipped in that assert there because zero-sized types will require some 38 | special handling throughout our code, and I want to defer the issue for now. 39 | Without this assert, some of our early drafts will do some Very Bad Things. 40 | 41 | Next we need to figure out what to actually do when we *do* want space. For that, 42 | we use the global allocation functions [`alloc`][alloc], [`realloc`][realloc], 43 | and [`dealloc`][dealloc] which are available in stable Rust in 44 | [`std::alloc`][std_alloc]. These functions are expected to become deprecated in 45 | favor of the methods of [`std::alloc::Global`][Global] after this type is stabilized. 46 | 47 | We'll also need a way to handle out-of-memory (OOM) conditions. The standard 48 | library provides a function [`alloc::handle_alloc_error`][handle_alloc_error], 49 | which will abort the program in a platform-specific manner. 50 | The reason we abort and don't panic is because unwinding can cause allocations 51 | to happen, and that seems like a bad thing to do when your allocator just came 52 | back with "hey I don't have any more memory". 53 | 54 | Of course, this is a bit silly since most platforms don't actually run out of 55 | memory in a conventional way. Your operating system will probably kill the 56 | application by another means if you legitimately start using up all the memory. 57 | The most likely way we'll trigger OOM is by just asking for ludicrous quantities 58 | of memory at once (e.g. half the theoretical address space). As such it's 59 | *probably* fine to panic and nothing bad will happen. Still, we're trying to be 60 | like the standard library as much as possible, so we'll just kill the whole 61 | program. 62 | 63 | Okay, now we can write growing. Roughly, we want to have this logic: 64 | 65 | ```text 66 | if cap == 0: 67 | allocate() 68 | cap = 1 69 | else: 70 | reallocate() 71 | cap *= 2 72 | ``` 73 | 74 | But Rust's only supported allocator API is so low level that we'll need to do a 75 | fair bit of extra work. We also need to guard against some special 76 | conditions that can occur with really large allocations or empty allocations. 77 | 78 | In particular, `ptr::offset` will cause us a lot of trouble, because it has 79 | the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to 80 | not have dealt with this instruction, here's the basic story with GEP: alias 81 | analysis, alias analysis, alias analysis. It's super important to an optimizing 82 | compiler to be able to reason about data dependencies and aliasing. 83 | 84 | As a simple example, consider the following fragment of code: 85 | 86 | 87 | ```rust,ignore 88 | *x *= 7; 89 | *y *= 3; 90 | ``` 91 | 92 | If the compiler can prove that `x` and `y` point to different locations in 93 | memory, the two operations can in theory be executed in parallel (by e.g. 94 | loading them into different registers and working on them independently). 95 | However the compiler can't do this in general because if x and y point to 96 | the same location in memory, the operations need to be done to the same value, 97 | and they can't just be merged afterwards. 
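As an aside, this is precisely the information that `&mut` references hand the optimizer for free: two `&mut`s are guaranteed not to alias, so in a function like the sketch below the two updates are independent and can be freely reordered. Raw pointers come with no such guarantee, which is why the inbounds rules described next matter so much.

```rust
// `x` and `y` are `&mut`, so they can't alias: the two writes are independent.
fn scale(x: &mut i32, y: &mut i32) {
    *x *= 7;
    *y *= 3;
}

let (mut a, mut b) = (1, 1);
scale(&mut a, &mut b);
assert_eq!((a, b), (7, 3));
```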
98 | 99 | When you use GEP inbounds, you are specifically telling LLVM that the offsets 100 | you're about to do are within the bounds of a single "allocated" entity. The 101 | ultimate payoff being that LLVM can assume that if two pointers are known to 102 | point to two disjoint objects, all the offsets of those pointers are *also* 103 | known to not alias (because you won't just end up in some random place in 104 | memory). LLVM is heavily optimized to work with GEP offsets, and inbounds 105 | offsets are the best of all, so it's important that we use them as much as 106 | possible. 107 | 108 | So that's what GEP's about, how can it cause us trouble? 109 | 110 | The first problem is that we index into arrays with unsigned integers, but 111 | GEP (and as a consequence `ptr::offset`) takes a signed integer. This means 112 | that half of the seemingly valid indices into an array will overflow GEP and 113 | actually go in the wrong direction! As such we must limit all allocations to 114 | `isize::MAX` elements. This actually means we only need to worry about 115 | byte-sized objects, because e.g. `> isize::MAX` `u16`s will truly exhaust all of 116 | the system's memory. However in order to avoid subtle corner cases where someone 117 | reinterprets some array of `< isize::MAX` objects as bytes, std limits all 118 | allocations to `isize::MAX` bytes. 119 | 120 | On all 64-bit targets that Rust currently supports we're artificially limited 121 | to significantly less than all 64 bits of the address space (modern x64 122 | platforms only expose 48-bit addressing), so we can rely on just running out of 123 | memory first. However on 32-bit targets, particularly those with extensions to 124 | use more of the address space (PAE x86 or x32), it's theoretically possible to 125 | successfully allocate more than `isize::MAX` bytes of memory. 126 | 127 | However since this is a tutorial, we're not going to be particularly optimal 128 | here, and just unconditionally check, rather than use clever platform-specific 129 | `cfg`s. 130 | 131 | The other corner-case we need to worry about is empty allocations. There will 132 | be two kinds of empty allocations we need to worry about: `cap = 0` for all T, 133 | and `cap > 0` for zero-sized types. 134 | 135 | These cases are tricky because they come 136 | down to what LLVM means by "allocated". LLVM's notion of an 137 | allocation is significantly more abstract than how we usually use it. Because 138 | LLVM needs to work with different languages' semantics and custom allocators, 139 | it can't really intimately understand allocation. Instead, the main idea behind 140 | allocation is "doesn't overlap with other stuff". That is, heap allocations, 141 | stack allocations, and globals don't randomly overlap. Yep, it's about alias 142 | analysis. As such, Rust can technically play a bit fast and loose with the notion of 143 | an allocation as long as it's *consistent*. 144 | 145 | Getting back to the empty allocation case, there are a couple of places where 146 | we want to offset by 0 as a consequence of generic code. The question is then: 147 | is it consistent to do so? For zero-sized types, we have concluded that it is 148 | indeed consistent to do a GEP inbounds offset by an arbitrary number of 149 | elements. This is a runtime no-op because every element takes up no space, 150 | and it's fine to pretend that there's infinite zero-sized types allocated 151 | at `0x01`. 
No allocator will ever allocate that address, because they won't
152 | allocate `0x00` and they generally allocate to some minimal alignment higher
153 | than a byte. Also generally the whole first page of memory is
154 | protected from being allocated anyway (a whole 4k, on many platforms).
155 | 
156 | However what about for positive-sized types? That one's a bit trickier. In
157 | principle, you can argue that offsetting by 0 gives LLVM no information: either
158 | there's an element before the address or after it, but it can't know which.
159 | However we've chosen to conservatively assume that it may do bad things. As
160 | such we will guard against this case explicitly.
161 | 
162 | *Phew*
163 | 
164 | Ok with all the nonsense out of the way, let's actually allocate some memory:
165 | 
166 | 
167 | ```rust,ignore
168 | use std::alloc::{self, Layout};
169 | 
170 | impl<T> Vec<T> {
171 |     fn grow(&mut self) {
172 |         let (new_cap, new_layout) = if self.cap == 0 {
173 |             (1, Layout::array::<T>(1))
174 |         } else {
175 |             // This can't overflow since self.cap <= isize::MAX.
176 |             let new_cap = 2 * self.cap;
177 |             (new_cap, Layout::array::<T>(new_cap))
178 |         };
179 | 
180 |         // `Layout::array` checks that the number of bytes allocated is
181 |         // in 1..=isize::MAX and will error otherwise. An allocation of
182 |         // 0 bytes isn't possible thanks to the above condition.
183 |         let new_layout = new_layout.expect("Allocation too large");
184 | 
185 |         let new_ptr = if self.cap == 0 {
186 |             unsafe { alloc::alloc(new_layout) }
187 |         } else {
188 |             let old_layout = Layout::array::<T>(self.cap).unwrap();
189 |             let old_ptr = self.ptr.as_ptr() as *mut u8;
190 |             unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
191 |         };
192 | 
193 |         // If allocation fails, `new_ptr` will be null, in which case we abort.
194 |         self.ptr = match NonNull::new(new_ptr as *mut T) {
195 |             Some(p) => p,
196 |             None => alloc::handle_alloc_error(new_layout),
197 |         };
198 |         self.cap = new_cap;
199 |     }
200 | }
201 | # fn main() {}
202 | ```
203 | 
204 | [Global]: ../../std/alloc/struct.Global.html
205 | [handle_alloc_error]: ../../alloc/alloc/fn.handle_alloc_error.html
206 | [alloc]: ../../alloc/alloc/fn.alloc.html
207 | [realloc]: ../../alloc/alloc/fn.realloc.html
208 | [dealloc]: ../../alloc/alloc/fn.dealloc.html
209 | [std_alloc]: ../../alloc/alloc/index.html
210 | 
--------------------------------------------------------------------------------