├── .gitignore
├── src
│   ├── arc-mutex
│   │   ├── arc-and-mutex.md
│   │   ├── arc.md
│   │   ├── arc-final.md
│   │   ├── arc-layout.md
│   │   ├── arc-clone.md
│   │   ├── arc-drop.md
│   │   └── arc-base.md
│   ├── uninitialized.md
│   ├── vec
│   │   ├── vec.md
│   │   ├── vec-dealloc.md
│   │   ├── vec-deref.md
│   │   ├── vec-insert-remove.md
│   │   ├── vec-layout.md
│   │   ├── vec-push-pop.md
│   │   ├── vec-raw.md
│   │   ├── vec-drain.md
│   │   ├── vec-into-iter.md
│   │   ├── vec-zsts.md
│   │   └── vec-alloc.md
│   ├── data.md
│   ├── concurrency.md
│   ├── references.md
│   ├── obrm.md
│   ├── conversions.md
│   ├── coercions.md
│   ├── casts.md
│   ├── unbounded-lifetimes.md
│   ├── poisoning.md
│   ├── panic-handler.md
│   ├── constructors.md
│   ├── SUMMARY.md
│   ├── lifetime-elision.md
│   ├── ownership.md
│   ├── hrtb.md
│   ├── unwinding.md
│   ├── transmutes.md
│   ├── beneath-std.md
│   ├── intro.md
│   ├── checked-uninit.md
│   ├── drop-flags.md
│   ├── meet-safe-and-unsafe.md
│   ├── lifetime-mismatch.md
│   ├── races.md
│   ├── working-with-unsafe.md
│   ├── what-unsafe-does.md
│   ├── repr-rust.md
│   ├── dot-operator.md
│   ├── aliasing.md
│   ├── destructors.md
│   ├── unchecked-uninit.md
│   ├── exotic-sizes.md
│   ├── exception-safety.md
│   ├── other-reprs.md
│   ├── borrow-splitting.md
│   └── safe-unsafe-meaning.md
├── CITATION.cff
├── theme
│   └── nomicon.css
├── LICENSE-MIT
├── book.toml
├── .github
│   └── workflows
│       └── main.yml
└── README.md
/.gitignore:
--------------------------------------------------------------------------------
1 | *.html
2 | book
3 |
4 | # linkcheck stuff
5 | linkcheck
6 | linkchecker
7 | linkcheck.sh
8 |
--------------------------------------------------------------------------------
/src/arc-mutex/arc-and-mutex.md:
--------------------------------------------------------------------------------
1 | # Implementing Arc and Mutex
2 |
3 | Knowing the theory is all fine and good, but the *best* way to understand
4 | something is to use it. To better understand atomics and interior mutability,
5 | we'll be implementing versions of the standard library's `Arc` and `Mutex` types.
6 |
7 | TODO: Write `Mutex` chapters.
8 |
--------------------------------------------------------------------------------
/CITATION.cff:
--------------------------------------------------------------------------------
1 | cff-version: 1.2.0
2 | message: If you use this book, please cite it using these metadata.
3 | title: The Rustonomicon
4 | abstract: The Dark Arts of Advanced and Unsafe Rust Programming
5 | authors:
6 | - name: "The Rust Project Developers"
7 | date-released: "2017-03-03"
8 | license: "MIT OR Apache-2.0"
9 | repository-code: "https://github.com/rust-lang/nomicon"
10 |
--------------------------------------------------------------------------------
/src/uninitialized.md:
--------------------------------------------------------------------------------
1 | # Working With Uninitialized Memory
2 |
3 | All runtime-allocated memory in a Rust program begins its life as
4 | *uninitialized*. In this state the value of the memory is an indeterminate pile
5 | of bits that may or may not even reflect a valid state for the type that is
6 | supposed to inhabit that location of memory. Attempting to interpret this memory
7 | as a value of *any* type will cause Undefined Behavior. Do Not Do This.
8 |
9 | Rust provides mechanisms to work with uninitialized memory in checked (safe) and
10 | unchecked (unsafe) ways.
11 |
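To make the distinction concrete, here is a minimal sketch of both styles. The function names `checked` and `unchecked` are hypothetical and exist only for illustration; the second one uses `std::mem::MaybeUninit`.

```rust
use std::mem::MaybeUninit;

// Checked: the compiler proves that `x` is assigned on every path
// before it is read, so no `unsafe` is needed.
fn checked(flag: bool) -> i32 {
    let x: i32;
    if flag {
        x = 1;
    } else {
        x = 2;
    }
    x
}

// Unchecked: we take responsibility for initialization ourselves.
fn unchecked() -> i32 {
    let mut slot = MaybeUninit::<i32>::uninit();
    slot.write(42);
    // SAFETY: `slot` was fully initialized by the `write` above.
    unsafe { slot.assume_init() }
}
```

Calling `assume_init` before the `write` would be Undefined Behavior, which is exactly the hazard described above.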
--------------------------------------------------------------------------------
/src/arc-mutex/arc.md:
--------------------------------------------------------------------------------
1 | # Implementing Arc
2 |
3 | In this section, we'll be implementing a simpler version of `std::sync::Arc`.
4 | Similarly to [the implementation of `Vec` we made earlier](../vec/vec.md), we won't be
5 | taking advantage of as many optimizations, intrinsics, or unstable code as the
6 | standard library may.
7 |
8 | This implementation is loosely based on the standard library's implementation
9 | (technically taken from `alloc::sync` in 1.49, as that's where it's actually
10 | implemented), but it will not support weak references at the moment as they
11 | make the implementation slightly more complex.
12 |
13 | Please note that this section is very work-in-progress at the moment.
14 |
--------------------------------------------------------------------------------
/src/vec/vec.md:
--------------------------------------------------------------------------------
1 | # Example: Implementing Vec
2 |
3 | To bring everything together, we're going to write `std::vec::Vec` from scratch.
4 | We will limit ourselves to stable Rust. In particular we won't use any
5 | intrinsics that could make our code a little bit nicer or more efficient, because
6 | intrinsics are permanently unstable (although many intrinsics *do* become
7 | stabilized elsewhere: `std::ptr` and `std::mem` consist of many intrinsics).
8 |
9 | Ultimately this means our implementation may not take advantage of all
10 | possible optimizations, though it will be by no means *naive*. We will
11 | definitely get into the weeds over nitty-gritty details, even
12 | when the problem doesn't *really* merit it.
13 |
14 | You wanted advanced. We're gonna go advanced.
15 |
--------------------------------------------------------------------------------
/src/data.md:
--------------------------------------------------------------------------------
1 | # Data Representation in Rust
2 |
3 | Low-level programming cares a lot about data layout. It's a big deal. It also
4 | pervasively influences the rest of the language, so we're going to start by
5 | digging into how data is represented in Rust.
6 |
7 | This chapter is ideally in agreement with, and rendered redundant by,
8 | the [Type Layout section of the Reference][ref-type-layout]. When this
9 | book was first written, the reference was in complete disrepair, and the
10 | Rustonomicon was attempting to serve as a partial replacement for the reference.
11 | This is no longer the case, so this whole chapter can ideally be deleted.
12 |
13 | We'll keep this chapter around for a bit longer, but ideally you should be
14 | contributing any new facts or improvements to the Reference instead.
15 |
16 | [ref-type-layout]: ../reference/type-layout.html
17 |
--------------------------------------------------------------------------------
/src/concurrency.md:
--------------------------------------------------------------------------------
1 | # Concurrency and Parallelism
2 |
3 | Rust as a language doesn't *really* have an opinion on how to do concurrency or
4 | parallelism. The standard library exposes OS threads and blocking sys-calls
5 | because everyone has those, and they're uniform enough that you can provide
6 | an abstraction over them in a relatively uncontroversial way. Message passing,
7 | green threads, and async APIs are all diverse enough that any abstraction over
8 | them tends to involve trade-offs that we weren't willing to commit to for 1.0.
9 |
10 | However the way Rust models concurrency makes it relatively easy to design your own
11 | concurrency paradigm as a library and have everyone else's code Just Work
12 | with yours. Just require the right lifetimes and Send and Sync where appropriate
13 | and you're off to the races. Or rather, off to the... not... having... races.
14 |
--------------------------------------------------------------------------------
/theme/nomicon.css:
--------------------------------------------------------------------------------
1 | /*
2 | Taken from the reference.
3 | Warnings and notes:
4 | Write the <div>s on their own line. E.g.
5 |
6 | Warning: This is bad!
7 |
8 | */
9 | main .warning p {
10 | padding: 10px 20px;
11 | margin: 20px 0;
12 | }
13 |
14 | main .warning p::before {
15 | content: "⚠️ ";
16 | }
17 |
18 | .light main .warning p,
19 | .rust main .warning p {
20 | border: 2px solid red;
21 | background: #ffcece;
22 | }
23 |
24 | .rust main .warning p {
25 | /* overrides previous declaration */
26 | border-color: #961717;
27 | }
28 |
29 | .coal main .warning p,
30 | .navy main .warning p,
31 | .ayu main .warning p {
32 | background: #542626
33 | }
34 |
35 | /* Make the links higher contrast on dark themes */
36 | .coal main .warning p a,
37 | .navy main .warning p a,
38 | .ayu main .warning p a {
39 | color: #80d0d0
40 | }
41 |
--------------------------------------------------------------------------------
/src/references.md:
--------------------------------------------------------------------------------
1 | # References
2 |
3 | There are two kinds of references:
4 |
5 | * Shared reference: `&`
6 | * Mutable reference: `&mut`
7 |
8 | Which obey the following rules:
9 |
10 | * A reference cannot outlive its referent
11 | * A mutable reference cannot be aliased
12 |
13 | That's it. That's the whole model references follow.
14 |
15 | Of course, we should probably define what *aliased* means.
16 |
17 | ```text
18 | error[E0425]: cannot find value `aliased` in this scope
19 | --> :2:20
20 | |
21 | 2 | println!("{}", aliased);
22 | | ^^^^^^^ not found in this scope
23 |
24 | error: aborting due to previous error
25 | ```
26 |
27 | Unfortunately, Rust hasn't actually defined its aliasing model. 🙀
28 |
29 | While we wait for the Rust devs to specify the semantics of their language,
30 | let's use the next section to discuss what aliasing is in general, and why it
31 | matters.
32 |
--------------------------------------------------------------------------------
/src/obrm.md:
--------------------------------------------------------------------------------
1 | # The Perils Of Ownership Based Resource Management (OBRM)
2 |
3 | OBRM (AKA RAII: Resource Acquisition Is Initialization) is something you'll
4 | interact with a lot in Rust. Especially if you use the standard library.
5 |
6 | Roughly speaking the pattern is as follows: to acquire a resource, you create an
7 | object that manages it. To release the resource, you simply destroy the object,
8 | and it cleans up the resource for you. The most common "resource" this pattern
9 | manages is simply *memory*. `Box`, `Rc`, and basically everything in
10 | `std::collections` is a convenience to enable correctly managing memory. This is
11 | particularly important in Rust because we have no pervasive GC to rely on for
12 | memory management. Which is the point, really: Rust is about control. However we
13 | are not limited to just memory. Pretty much every other system resource like a
14 | thread, file, or socket is exposed through this kind of API.
15 |
--------------------------------------------------------------------------------
/src/vec/vec-dealloc.md:
--------------------------------------------------------------------------------
1 | # Deallocating
2 |
3 | Next we should implement Drop so that we don't massively leak tons of resources.
4 | The easiest way is to just call `pop` until it yields None, and then deallocate
5 | our buffer. Note that calling `pop` is unneeded if `T: !Drop`. In theory we can
6 | ask Rust if `T` `needs_drop` and omit the calls to `pop`. However in practice
7 | LLVM is *really* good at removing simple side-effect free code like this, so I
8 | wouldn't bother unless you notice it's not being stripped (in this case it is).
9 |
10 | We must not call `alloc::dealloc` when `self.cap == 0`, as in this case we
11 | haven't actually allocated any memory.
12 |
13 |
14 | ```rust,ignore
15 | impl<T> Drop for Vec<T> {
16 |     fn drop(&mut self) {
17 |         if self.cap != 0 {
18 |             while let Some(_) = self.pop() { }
19 |             let layout = Layout::array::<T>(self.cap).unwrap();
20 |             unsafe {
21 |                 alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout);
22 |             }
23 |         }
24 |     }
25 | }
26 | ```
27 |
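For readers curious what "asking Rust if `T` `needs_drop`" looks like, here is a small sketch built on `std::mem::needs_drop`. The helper name `drop_matters` is hypothetical; a `Drop` impl could consult it to skip the `pop` loop entirely.

```rust
use std::mem;

/// Reports whether dropping a value of type `T` may run any code.
/// `needs_drop` is an optimization hint and may conservatively say `true`.
fn drop_matters<T>() -> bool {
    mem::needs_drop::<T>()
}
```

Plain integers report `false`, while owning types such as `String` report `true`, so a guard like `if mem::needs_drop::<T>() { /* pop everything */ }` would only pay for the loop when it has an effect.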
--------------------------------------------------------------------------------
/src/conversions.md:
--------------------------------------------------------------------------------
1 | # Type Conversions
2 |
3 | At the end of the day, everything is just a pile of bits somewhere, and type
4 | systems are just there to help us use those bits right. There are two common
5 | problems with typing bits: needing to reinterpret those exact bits as a
6 | different type, and needing to change the bits to have equivalent meaning for
7 | a different type. Because Rust encourages encoding important properties in the
8 | type system, these problems are incredibly pervasive. As such, Rust gives you
9 | several ways to solve them.
10 |
11 | First we'll look at the ways that Safe Rust gives you to reinterpret values.
12 | The most trivial way to do this is to just destructure a value into its
13 | constituent parts and then build a new type out of them. e.g.
14 |
15 | ```rust
16 | struct Foo {
17 |     x: u32,
18 |     y: u16,
19 | }
20 |
21 | struct Bar {
22 |     a: u32,
23 |     b: u16,
24 | }
25 |
26 | fn reinterpret(foo: Foo) -> Bar {
27 |     let Foo { x, y } = foo;
28 |     Bar { a: x, b: y }
29 | }
30 | ```
31 |
32 | But this is, at best, annoying. For common conversions, Rust provides
33 | more ergonomic alternatives.
34 |
--------------------------------------------------------------------------------
/LICENSE-MIT:
--------------------------------------------------------------------------------
1 | Copyright (c) 2010 The Rust Project Developers
2 |
3 | Permission is hereby granted, free of charge, to any
4 | person obtaining a copy of this software and associated
5 | documentation files (the "Software"), to deal in the
6 | Software without restriction, including without
7 | limitation the rights to use, copy, modify, merge,
8 | publish, distribute, sublicense, and/or sell copies of
9 | the Software, and to permit persons to whom the Software
10 | is furnished to do so, subject to the following
11 | conditions:
12 |
13 | The above copyright notice and this permission notice
14 | shall be included in all copies or substantial portions
15 | of the Software.
16 |
17 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF
18 | ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
19 | TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
20 | PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
21 | SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
22 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
23 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
24 | IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
25 | DEALINGS IN THE SOFTWARE.
26 |
--------------------------------------------------------------------------------
/book.toml:
--------------------------------------------------------------------------------
1 | [book]
2 | authors = ["The Rust Project Developers"]
3 | title = "The Rustonomicon"
4 | description = "The Dark Arts of Advanced and Unsafe Rust Programming"
5 |
6 | [output.html]
7 | additional-css = ["theme/nomicon.css"]
8 | git-repository-url = "https://github.com/rust-lang/nomicon"
9 |
10 | [output.html.redirect]
11 | # Vec-related chapters.
12 | "./vec-alloc.html" = "./vec/vec-alloc.html"
13 | "./vec-dealloc.html" = "./vec/vec-dealloc.html"
14 | "./vec-deref.html" = "./vec/vec-deref.html"
15 | "./vec-drain.html" = "./vec/vec-drain.html"
16 | "./vec-final.html" = "./vec/vec-final.html"
17 | "./vec-insert-remove.html" = "./vec/vec-insert-remove.html"
18 | "./vec-into-iter.html" = "./vec/vec-into-iter.html"
19 | "./vec-layout.html" = "./vec/vec-layout.html"
20 | "./vec-push-pop.html" = "./vec/vec-push-pop.html"
21 | "./vec-raw.html" = "./vec/vec-raw.html"
22 | "./vec-zsts.html" = "./vec/vec-zsts.html"
23 | "./vec.html" = "./vec/vec.html"
24 |
25 | # Arc and Mutex related chapters.
26 | "./arc-and-mutex.html" = "./arc-mutex/arc-and-mutex.html"
27 | "./arc-base.html" = "./arc-mutex/arc-base.html"
28 | "./arc-clone.html" = "./arc-mutex/arc-clone.html"
29 | "./arc-drop.html" = "./arc-mutex/arc-drop.html"
30 | "./arc-final.html" = "./arc-mutex/arc-final.html"
31 | "./arc-layout.html" = "./arc-mutex/arc-layout.html"
32 | "./arc.html" = "./arc-mutex/arc.html"
33 |
34 | [rust]
35 | edition = "2024"
36 |
--------------------------------------------------------------------------------
/src/vec/vec-deref.md:
--------------------------------------------------------------------------------
1 | # Deref
2 |
3 | Alright! We've got a decent minimal stack implemented. We can push, we can
4 | pop, and we can clean up after ourselves. However there's a whole mess of
5 | functionality we'd reasonably want. In particular, we have a proper array, but
6 | none of the slice functionality. That's actually pretty easy to solve: we can
7 | implement `Deref<Target=[T]>`. This will magically make our Vec coerce to, and
8 | behave like, a slice in all sorts of conditions.
9 |
10 | All we need is `slice::from_raw_parts`. It will correctly handle empty slices
11 | for us. Later once we set up zero-sized type support it will also Just Work
12 | for those too.
13 |
14 |
15 | ```rust,ignore
16 | use std::ops::Deref;
17 |
18 | impl<T> Deref for Vec<T> {
19 |     type Target = [T];
20 |     fn deref(&self) -> &[T] {
21 |         unsafe {
22 |             std::slice::from_raw_parts(self.ptr.as_ptr(), self.len)
23 |         }
24 |     }
25 | }
26 | ```
27 |
28 | And let's do DerefMut too:
29 |
30 |
31 | ```rust,ignore
32 | use std::ops::DerefMut;
33 |
34 | impl<T> DerefMut for Vec<T> {
35 |     fn deref_mut(&mut self) -> &mut [T] {
36 |         unsafe {
37 |             std::slice::from_raw_parts_mut(self.ptr.as_ptr(), self.len)
38 |         }
39 |     }
40 | }
41 | ```
42 |
43 | Now we have `len`, `first`, `last`, indexing, slicing, sorting, `iter`,
44 | `iter_mut`, and all other sorts of bells and whistles provided by slice. Sweet!
45 |
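To see the effect in isolation, here is a self-contained sketch. It uses a hypothetical `Stack<T>` wrapper around the standard library's `Vec` as a stand-in for the hand-written type, since the full implementation is spread over several chapters.

```rust
use std::ops::Deref;

// Hypothetical stand-in for our own Vec; it stores its items in a std Vec.
struct Stack<T> {
    items: Vec<T>,
}

impl<T> Deref for Stack<T> {
    type Target = [T];
    fn deref(&self) -> &[T] {
        &self.items
    }
}

// None of these methods are defined on `Stack`; they all come from `[T]`.
fn slice_methods_demo() -> (usize, Option<i32>, bool) {
    let s = Stack { items: vec![3, 1, 2] };
    (s.len(), s.first().copied(), s.contains(&2))
}
```

The method calls resolve through auto-deref, which is the same mechanism that makes the real `Vec` feel like a slice.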
--------------------------------------------------------------------------------
/src/coercions.md:
--------------------------------------------------------------------------------
1 | # Coercions
2 |
3 | Types can implicitly be coerced to change in certain contexts.
4 | These changes are generally just *weakening* of types, largely focused around pointers and lifetimes.
5 | They mostly exist to make Rust "just work" in more cases, and are largely harmless.
6 |
7 | For an exhaustive list of all the types of coercions, see the [Coercion types] section on the reference.
8 |
9 | Note that we do not perform coercions when matching traits (except for receivers, see the [next page][dot-operator]).
10 | If there is an `impl` for some type `U` and `T` coerces to `U`, that does not constitute an implementation for `T`.
11 | For example, the following will not type check, even though it is OK to coerce `t` to `&T` and there is an `impl` for `&T`:
12 |
13 | ```rust,compile_fail
14 | trait Trait {}
15 |
16 | fn foo<X: Trait>(t: X) {}
17 |
18 | impl<'a> Trait for &'a i32 {}
19 |
20 | fn main() {
21 |     let t: &mut i32 = &mut 0;
22 |     foo(t);
23 | }
24 | ```
25 |
26 | which fails as follows:
27 |
28 | ```text
29 | error[E0277]: the trait bound `&mut i32: Trait` is not satisfied
30 |  --> src/main.rs:9:9
31 |   |
32 | 3 | fn foo<X: Trait>(t: X) {}
33 |   |           ----- required by this bound in `foo`
34 | ...
35 | 9 |     foo(t);
36 |   |         ^ the trait `Trait` is not implemented for `&mut i32`
37 |   |
38 |   = help: the following implementations were found:
39 |             <&'a i32 as Trait>
40 |   = note: `Trait` is implemented for `&i32`, but not for `&mut i32`
41 | ```
42 |
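One way to make such a call type check is to perform the conversion explicitly, so that trait matching sees `&i32` directly. The following is only a sketch; the wrapper `call_with_shared` is a hypothetical name used for illustration.

```rust
trait Trait {}

impl<'a> Trait for &'a i32 {}

fn foo<X: Trait>(_t: X) {}

fn call_with_shared(t: &mut i32) {
    // `&*t` is an explicit reborrow with type `&i32`, which has an impl.
    // No coercion is needed during trait matching.
    foo(&*t);
}
```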
43 | [Coercion types]: ../reference/type-coercions.html#coercion-types
44 | [dot-operator]: ./dot-operator.html
45 |
--------------------------------------------------------------------------------
/src/casts.md:
--------------------------------------------------------------------------------
1 | # Casts
2 |
3 | Casts are a superset of coercions: every coercion can be explicitly invoked via a cast.
4 | However some conversions require a cast.
5 | While coercions are pervasive and largely harmless, these "true casts" are rare and potentially dangerous.
6 | As such, casts must be explicitly invoked using the `as` keyword: `expr as Type`.
7 |
8 | You can find an exhaustive list of [all the true casts][cast list] and [casting semantics][semantics list] on the reference.
9 |
10 | ## Safety of casting
11 |
12 | True casts generally revolve around raw pointers and the primitive numeric types.
13 | Even though they're dangerous, these casts are infallible at runtime.
14 | If a cast triggers some subtle corner case no indication will be given that this occurred.
15 | The cast will simply succeed.
16 | That said, casts must be valid at the type level, or else they will be prevented statically.
17 | For instance, `7u8 as bool` will not compile.
18 |
19 | Casts aren't `unsafe` because they generally can't violate memory safety *on their own*.
20 | For instance, converting an integer to a raw pointer can very easily lead to terrible things.
21 | However the act of creating the pointer itself is safe, because actually using a raw pointer is already marked as `unsafe`.
22 |
23 | ## Some notes about casting
24 |
25 | ### Lengths when casting raw slices
26 |
27 | Note that lengths are not adjusted when casting raw slices; `*const [u16] as *const [u8]` creates a slice that only includes half of the original memory.
28 |
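The following sketch shows the unchanged length in practice. The function name `byte_view_len` is hypothetical; it casts a raw `u16` slice to a raw `u8` slice and reports the length of the result.

```rust
fn byte_view_len(words: &[u16]) -> usize {
    let wide: *const [u16] = words;
    // The element count is carried over unchanged by the cast.
    let narrow = wide as *const [u8];
    // SAFETY: `narrow` covers `words.len()` bytes, which lie inside the
    // `2 * words.len()` bytes borrowed by `words`, and `u8` has alignment 1.
    unsafe { (&*narrow).len() }
}
```

For a four-element `u16` slice (8 bytes) the byte view has length 4, so it spans only the first half of the memory.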
29 | ### Transitivity
30 |
31 | Casting is not transitive, that is, even if `e as U1 as U2` is a valid expression, `e as U2` is not necessarily so.
32 |
33 | [cast list]: ../reference/expressions/operator-expr.html#type-cast-expressions
34 | [semantics list]: ../reference/expressions/operator-expr.html#semantics
35 |
--------------------------------------------------------------------------------
/.github/workflows/main.yml:
--------------------------------------------------------------------------------
1 | name: CI
2 | on:
3 |   pull_request:
4 |   merge_group:
5 |
6 | env:
7 |   MDBOOK_VERSION: 0.5.1
8 |
9 | jobs:
10 |   test:
11 |     name: Test
12 |     runs-on: ubuntu-latest
13 |     steps:
14 |       - uses: actions/checkout@v4
15 |       - name: Update rustup
16 |         run: rustup self update
17 |       - name: Install Rust
18 |         run: |
19 |           rustup set profile minimal
20 |           rustup toolchain install nightly -c rust-docs
21 |           rustup default nightly
22 |       - name: Install mdbook
23 |         run: |
24 |           mkdir bin
25 |           curl -sSL https://github.com/rust-lang/mdBook/releases/download/v${MDBOOK_VERSION}/mdbook-v${MDBOOK_VERSION}-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=bin
26 |           echo "$(pwd)/bin" >> $GITHUB_PATH
27 |       - name: Report versions
28 |         run: |
29 |           rustup --version
30 |           rustc -Vv
31 |           mdbook --version
32 |       - name: Run tests
33 |         run: mdbook test
34 |       - name: Check for broken links
35 |         run: |
36 |           curl -sSLo linkcheck.sh \
37 |             https://raw.githubusercontent.com/rust-lang/rust/master/src/tools/linkchecker/linkcheck.sh
38 |           sh linkcheck.sh --all nomicon
39 |
40 |   # The success job is here to consolidate the total success/failure state of
41 |   # all other jobs. This job is then included in the GitHub branch protection
42 |   # rule which prevents merges unless all other jobs are passing. This makes
43 |   # it easier to manage the list of jobs via this yml file and to prevent
44 |   # accidentally adding new jobs without also updating the branch protections.
45 |   success:
46 |     name: Success gate
47 |     if: always()
48 |     needs:
49 |       - test
50 |     runs-on: ubuntu-latest
51 |     steps:
52 |       - run: jq --exit-status 'all(.result == "success")' <<< '${{ toJson(needs) }}'
53 |       - name: Done
54 |         run: exit 0
55 |
--------------------------------------------------------------------------------
/src/vec/vec-insert-remove.md:
--------------------------------------------------------------------------------
1 | # Insert and Remove
2 |
3 | Something *not* provided by slice is `insert` and `remove`, so let's do those
4 | next.
5 |
6 | Insert needs to shift all the elements from the target index onward to the right by one.
7 | To do this we need to use `ptr::copy`, which is our version of C's `memmove`.
8 | This copies some chunk of memory from one location to another, correctly
9 | handling the case where the source and destination overlap (which will
10 | definitely happen here).
11 |
12 | If we insert at index `i`, we want to shift the `[i .. len]` to `[i+1 .. len+1]`
13 | using the old len.
14 |
15 |
16 | ```rust,ignore
17 | pub fn insert(&mut self, index: usize, elem: T) {
18 |     // Note: `<=` because it's valid to insert after everything
19 |     // which would be equivalent to push.
20 |     assert!(index <= self.len, "index out of bounds");
21 |     if self.len == self.cap { self.grow(); }
22 |
23 |     unsafe {
24 |         // ptr::copy(src, dest, len): "copy from src to dest len elems"
25 |         ptr::copy(
26 |             self.ptr.as_ptr().add(index),
27 |             self.ptr.as_ptr().add(index + 1),
28 |             self.len - index,
29 |         );
30 |         ptr::write(self.ptr.as_ptr().add(index), elem);
31 |     }
32 |
33 |     self.len += 1;
34 | }
35 | ```
36 |
37 | Remove behaves in the opposite manner. We need to shift all the elements from
38 | `[i+1 .. len + 1]` to `[i .. len]` using the *new* len.
39 |
40 |
41 | ```rust,ignore
42 | pub fn remove(&mut self, index: usize) -> T {
43 |     // Note: `<` because it's *not* valid to remove after everything
44 |     assert!(index < self.len, "index out of bounds");
45 |     unsafe {
46 |         self.len -= 1;
47 |         let result = ptr::read(self.ptr.as_ptr().add(index));
48 |         ptr::copy(
49 |             self.ptr.as_ptr().add(index + 1),
50 |             self.ptr.as_ptr().add(index),
51 |             self.len - index,
52 |         );
53 |         result
54 |     }
55 | }
56 | ```
57 |
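The index arithmetic is easier to check on a concrete buffer. The sketch below replays both shifts on plain arrays of `i32`; the function names and the fixed-size arrays are hypothetical stand-ins for the Vec's allocation, with one spare slot playing the role of unused capacity.

```rust
use std::ptr;

// Insert 99 at index 1 of the logical contents [1, 2, 3, 4].
fn insert_demo() -> [i32; 5] {
    let mut buf = [1, 2, 3, 4, 0]; // last slot is spare capacity
    let (len, index) = (4, 1);
    unsafe {
        let p = buf.as_mut_ptr();
        // Shift [index .. len] to [index + 1 .. len + 1]; the ranges overlap.
        ptr::copy(p.add(index), p.add(index + 1), len - index);
        ptr::write(p.add(index), 99);
    }
    buf
}

// Remove index 1 from [1, 2, 3, 4]; returns the buffer, the removed
// element, and the new logical length.
fn remove_demo() -> ([i32; 4], i32, usize) {
    let mut buf = [1, 2, 3, 4];
    let index = 1;
    let new_len = buf.len() - 1;
    unsafe {
        let p = buf.as_mut_ptr();
        let removed = ptr::read(p.add(index));
        // Shift [index + 1 .. new_len + 1] down to [index .. new_len].
        ptr::copy(p.add(index + 1), p.add(index), new_len - index);
        (buf, removed, new_len)
    }
}
```

After the removal the final slot still holds a stale copy of the last element; it is simply outside the new length and must not be read as a live value.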
--------------------------------------------------------------------------------
/src/unbounded-lifetimes.md:
--------------------------------------------------------------------------------
1 | # Unbounded Lifetimes
2 |
3 | Unsafe code can often end up producing references or lifetimes out of thin air.
4 | Such lifetimes come into the world as *unbounded*. The most common source of
5 | this is taking a reference to a dereferenced raw pointer, which produces a
6 | reference with an unbounded lifetime. Such a lifetime becomes as big as context
7 | demands. This is in fact more powerful than simply becoming `'static`, because
8 | for instance `&'static &'a T` will fail to typecheck, but the unbounded lifetime
9 | will perfectly mold into `&'a &'a T` as needed. However for most intents and
10 | purposes, such an unbounded lifetime can be regarded as `'static`.
11 |
12 | Almost no reference is `'static`, so this is probably wrong. `transmute` and
13 | `transmute_copy` are the two other primary offenders. One should endeavor to
14 | bound an unbounded lifetime as quickly as possible, especially across function
15 | boundaries.
16 |
17 | Given a function, any output lifetimes that don't derive from inputs are
18 | unbounded. For instance:
19 |
20 |
21 | ```rust,no_run
22 | fn get_str<'a>(s: *const String) -> &'a str {
23 |     unsafe { &*s }
24 | }
25 |
26 | fn main() {
27 |     let soon_dropped = String::from("hello");
28 |     let dangling = get_str(&soon_dropped);
29 |     drop(soon_dropped);
30 |     println!("Invalid str: {}", dangling); // Invalid str: gӚ_`
31 | }
32 | ```
33 |
34 | The easiest way to avoid unbounded lifetimes is to use lifetime elision at the
35 | function boundary. If an output lifetime is elided, then it *must* be bounded by
36 | an input lifetime. Of course it might be bounded by the *wrong* lifetime, but
37 | this will usually just cause a compiler error, rather than allow memory safety
38 | to be trivially violated.
39 |
40 | Within a function, bounding lifetimes is more error-prone. The safest and easiest
41 | way to bound a lifetime is to return it from a function with a bound lifetime.
42 | However if this is unacceptable, the reference can be placed in a location with
43 | a specific lifetime. Unfortunately it's impossible to name all lifetimes involved
44 | in a function.
45 |
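As a sketch of the elision approach, the variant below takes a reference instead of a raw pointer and elides the output lifetime. The signature is an assumption made for illustration; `first_len` is a hypothetical caller.

```rust
// With the lifetime elided, the returned `&str` is tied to the borrow of `s`.
fn get_str(s: &String) -> &str {
    let raw: *const String = s;
    // SAFETY: `raw` came from a live reference, and elision bounds the
    // otherwise unbounded lifetime by the lifetime of `s`.
    unsafe { &*raw }
}

fn first_len() -> usize {
    let owned = String::from("hello");
    let borrowed = get_str(&owned);
    // drop(owned); // rejected by the borrow checker: `owned` is still borrowed
    borrowed.len()
}
```

The `unsafe` block still produces an unbounded lifetime internally, but it is bounded as soon as it crosses the function boundary, so the dangling use from the earlier example no longer compiles.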
--------------------------------------------------------------------------------
/src/vec/vec-layout.md:
--------------------------------------------------------------------------------
1 | # Layout
2 |
3 | First off, we need to come up with the struct layout. A Vec has three parts:
4 | a pointer to the allocation, the size of the allocation, and the number of
5 | elements that have been initialized.
6 |
7 | Naively, this means we just want this design:
8 |
9 |
10 | ```rust,ignore
11 | pub struct Vec<T> {
12 | ptr: *mut T,
13 | cap: usize,
14 | len: usize,
15 | }
16 | ```
17 |
18 | And indeed this would compile. Unfortunately, the design is too strict: because
19 | `*mut T` is invariant over `T`, the compiler infers invariance for `Vec<T>`. So a `&Vec<&'static str>`
20 | couldn't be used where a `&Vec<&'a str>` was expected. See [the chapter
21 | on ownership and lifetimes][ownership] for all the details on variance.
22 |
23 | As we saw in the ownership chapter, the standard library uses `Unique<T>` in place of
24 | `*mut T` when it has a raw pointer to an allocation that it owns. Unique is unstable,
25 | so we'd like to not use it if possible, though.
26 |
27 | As a recap, Unique is a wrapper around a raw pointer that declares that:
28 |
29 | * We are covariant over `T`
30 | * We may own a value of type `T` (this is not relevant for our example here, but see
31 | [the chapter on PhantomData][phantom-data] on why the real `std::vec::Vec` needs this)
32 | * We are Send/Sync if `T` is Send/Sync
33 | * Our pointer is never null (so `Option<Vec<T>>` is null-pointer-optimized)
34 |
35 | We can implement all of the above requirements in stable Rust. To do this, instead
36 | of using `Unique<T>` we will use [`NonNull<T>`][NonNull], another wrapper around a
37 | raw pointer, which gives us two of the above properties, namely it is covariant
38 | over `T` and is declared to never be null. By implementing Send/Sync if `T` is,
39 | we get the same results as using `Unique<T>`:
40 |
41 | ```rust
42 | use std::ptr::NonNull;
43 |
44 | pub struct Vec<T> {
45 |     ptr: NonNull<T>,
46 |     cap: usize,
47 |     len: usize,
48 | }
49 |
50 | unsafe impl<T: Send> Send for Vec<T> {}
51 | unsafe impl<T: Sync> Sync for Vec<T> {}
52 | # fn main() {}
53 | ```
54 |
55 | [ownership]: ../ownership.html
56 | [phantom-data]: ../phantom-data.md
57 | [NonNull]: ../../std/ptr/struct.NonNull.html
58 |
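The two properties that `NonNull` provides can be observed directly. The sketch below uses a hypothetical one-field struct `Covariant<T>` rather than the full Vec, and the helper names are illustrative only.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

// Hypothetical wrapper with the same pointer field as our Vec.
struct Covariant<T> {
    ptr: NonNull<T>,
}

// Compiles because `NonNull<T>` is covariant over `T`: a longer lifetime
// may be shortened. With a `*mut T` field this function would be rejected.
fn shorten<'a>(v: Covariant<&'static str>) -> Covariant<&'a str> {
    v
}

// Because the pointer is never null, `None` can use the null bit pattern.
fn option_costs_nothing() -> bool {
    size_of::<Option<NonNull<u8>>>() == size_of::<*mut u8>()
}
```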
--------------------------------------------------------------------------------
/src/poisoning.md:
--------------------------------------------------------------------------------
1 | # Poisoning
2 |
3 | Although all unsafe code *must* ensure it has minimal exception safety, not all
4 | types ensure *maximal* exception safety. Even if the type does, your code may
5 | ascribe additional meaning to it. For instance, an integer is certainly
6 | exception-safe, but has no semantics on its own. It's possible that code that
7 | panics could fail to correctly update the integer, producing an inconsistent
8 | program state.
9 |
10 | This is *usually* fine, because anything that witnesses an exception is about
11 | to get destroyed. For instance, if you send a Vec to another thread and that
12 | thread panics, it doesn't matter if the Vec is in a weird state. It will be
13 | dropped and go away forever. However some types are especially good at smuggling
14 | values across the panic boundary.
15 |
16 | These types may choose to explicitly *poison* themselves if they witness a panic.
17 | Poisoning doesn't entail anything in particular. Generally it just means
18 | preventing normal usage from proceeding. The most notable example of this is the
19 | standard library's Mutex type. A Mutex will poison itself if one of its
20 | MutexGuards (the thing it returns when a lock is obtained) is dropped during a
21 | panic. Any future attempts to lock the Mutex will return an `Err` or panic.
22 |
23 | Mutex poisons not for true safety in the sense that Rust normally cares about. It
24 | poisons as a safety-guard against blindly using the data that comes out of a Mutex
25 | that has witnessed a panic while locked. The data in such a Mutex was likely in the
26 | middle of being modified, and as such may be in an inconsistent or incomplete state.
27 | It is important to note that one cannot violate memory safety with such a type
28 | if it is correctly written. After all, it must be minimally exception-safe!
29 |
30 | However if the Mutex contained, say, a BinaryHeap that does not actually have the
31 | heap property, it's unlikely that any code that uses it will do
32 | what the author intended. As such, the program should not proceed normally.
33 | Still, if you're double-plus-sure that you can do *something* with the value,
34 | the Mutex exposes a method to get the lock anyway. It *is* safe, after all.
35 | Just maybe nonsense.
36 |
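As a rough sketch of what this looks like from the caller's side, the following poisons a standard library `Mutex` and then gets at the data anyway. The helper names `poison` and `recover` are made up for illustration; only `lock`, `is_poisoned`, and `PoisonError::into_inner` come from `std`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons `m` by panicking on another thread while its lock is held.
/// (`poison` is a hypothetical helper, not a std API.)
fn poison(m: &Arc<Mutex<Vec<u32>>>) {
    let m2 = Arc::clone(m);
    let handle = thread::spawn(move || {
        let mut guard = m2.lock().unwrap();
        guard.push(1);
        // The guard is dropped while this thread unwinds, which poisons the Mutex.
        panic!("simulated failure while the lock is held");
    });
    // `join` returns `Err` because the spawned thread panicked.
    assert!(handle.join().is_err());
}

/// Reads the length anyway by pulling the guard out of the `PoisonError`.
/// (`recover` is a hypothetical helper, not a std API.)
fn recover(m: &Mutex<Vec<u32>>) -> usize {
    let guard = m.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    guard.len()
}
```

After `poison(&m)`, a plain `m.lock()` returns `Err`, while `recover(&m)` still hands back whatever the panicking thread left behind. Whether that value makes sense is up to you.
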
--------------------------------------------------------------------------------
/src/vec/vec-push-pop.md:
--------------------------------------------------------------------------------
1 | # Push and Pop
2 |
3 | Alright. We can initialize. We can allocate. Let's actually implement some
4 | functionality! Let's start with `push`. All it needs to do is check if we're
5 | full to grow, unconditionally write to the next index, and then increment our
6 | length.
7 |
8 | To do the write we have to be careful not to evaluate the memory we want to write
9 | to. At worst, it's truly uninitialized memory from the allocator. At best it's the
10 | bits of some old value we popped off. Either way, we can't just index to the memory
11 | and dereference it, because that will evaluate the memory as a valid instance of
12 | T. Worse, `foo[idx] = x` will try to call `drop` on the old value of `foo[idx]`!
13 |
14 | The correct way to do this is with `ptr::write`, which just blindly overwrites the
15 | target address with the bits of the value we provide. No evaluation involved.
16 |
17 | For `push`, if the old len (before push was called) is 0, then we want to write
18 | to the 0th index. So we should offset by the old len.
19 |
20 |
21 | ```rust,ignore
22 | pub fn push(&mut self, elem: T) {
23 |     if self.len == self.cap { self.grow(); }
24 |
25 |     unsafe {
26 |         ptr::write(self.ptr.as_ptr().add(self.len), elem);
27 |     }
28 |
29 |     // Can't fail, we'll OOM first.
30 |     self.len += 1;
31 | }
32 | ```
33 |
34 | Easy! How about `pop`? Although this time the index we want to access is
35 | initialized, Rust won't just let us dereference the location of memory to move
36 | the value out, because that would leave the memory uninitialized! For this we
37 | need `ptr::read`, which just copies out the bits from the target address and
38 | interprets it as a value of type T. This will leave the memory at this address
39 | logically uninitialized, even though there is in fact a perfectly good instance
40 | of T there.
41 |
42 | For `pop`, if the old len is 1, for example, we want to read out of the 0th
43 | index. So we should offset by the new len.
44 |
45 |
46 | ```rust,ignore
47 | pub fn pop(&mut self) -> Option<T> {
48 |     if self.len == 0 {
49 |         None
50 |     } else {
51 |         self.len -= 1;
52 |         unsafe {
53 |             Some(ptr::read(self.ptr.as_ptr().add(self.len)))
54 |         }
55 |     }
56 | }
57 | ```
58 |
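The `push` and `pop` above lean on the `ptr`, `cap`, and `grow` pieces from earlier chapters, so they can't be run on their own. Here is a minimal, self-contained sketch of the same `ptr::write` / `ptr::read` pattern. It swaps the heap allocation for a fixed-size array of `MaybeUninit<T>`, so there is no `grow` and a full stack hands the element back instead. `TinyStack` is a made-up type for illustration, not part of the Vec we're building.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// A fixed-capacity stack (hypothetical type, for illustration only).
pub struct TinyStack<T, const N: usize> {
    buf: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> TinyStack<T, N> {
    pub fn new() -> Self {
        TinyStack {
            buf: std::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    /// Hands the element back if the stack is full (there is no `grow` here).
    pub fn push(&mut self, elem: T) -> Result<(), T> {
        if self.len == N {
            return Err(elem);
        }
        unsafe {
            // Offset by the old len. `ptr::write` overwrites the slot without
            // reading or dropping whatever bits are already there.
            ptr::write(self.buf.as_mut_ptr().cast::<T>().add(self.len), elem);
        }
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            None
        } else {
            self.len -= 1;
            unsafe {
                // Offset by the new len. The slot is logically uninitialized
                // from here on, even though the bits are still there.
                Some(ptr::read(self.buf.as_ptr().cast::<T>().add(self.len)))
            }
        }
    }
}

impl<T, const N: usize> Drop for TinyStack<T, N> {
    fn drop(&mut self) {
        // Pop everything so the remaining elements get dropped exactly once.
        while self.pop().is_some() {}
    }
}
```

Using `String` elements is a decent smoke test: if `push` dropped the old bits or `pop` left a live duplicate behind, we'd expect a double free or a crash rather than a quiet success.
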
--------------------------------------------------------------------------------
/src/panic-handler.md:
--------------------------------------------------------------------------------
1 | # #[panic_handler]
2 |
3 | `#[panic_handler]` is used to define the behavior of `panic!` in `#![no_std]` applications.
4 | The `#[panic_handler]` attribute must be applied to a function with signature `fn(&PanicInfo)
5 | -> !` and such a function must appear *once* in the dependency graph of a binary / dylib / cdylib
6 | crate. The API of `PanicInfo` can be found in the [API docs].
7 |
8 | [API docs]: ../core/panic/struct.PanicInfo.html
9 |
10 | Given that `#![no_std]` applications have no *standard* output and that some `#![no_std]`
11 | applications, e.g. embedded applications, need different panicking behaviors for development and for
12 | release, it can be helpful to have panic crates: crates that only contain a `#[panic_handler]`.
13 | This way applications can easily swap the panicking behavior by simply linking to a different panic
14 | crate.
15 |
16 | The example below shows an application whose panicking behavior depends on
17 | whether it is compiled using the dev profile (`cargo build`) or the release profile (`cargo build
18 | --release`).
19 |
20 | `panic-semihosting` crate -- log panic messages to the host stderr using semihosting:
21 |
22 |
23 | ```rust,ignore
24 | #![no_std]
25 |
26 | use core::fmt::{Write, self};
27 | use core::panic::PanicInfo;
28 |
29 | struct HStderr {
30 |     // ..
31 | #     _0: (),
32 | }
33 | #
34 | # impl HStderr {
35 | #     fn new() -> HStderr { HStderr { _0: () } }
36 | # }
37 | #
38 | # impl fmt::Write for HStderr {
39 | #     fn write_str(&mut self, _: &str) -> fmt::Result { Ok(()) }
40 | # }
41 |
42 | #[panic_handler]
43 | fn panic(info: &PanicInfo) -> ! {
44 |     let mut host_stderr = HStderr::new();
45 |
46 |     // logs "panicked at '$reason', src/main.rs:27:4" to the host stderr
47 |     writeln!(host_stderr, "{}", info).ok();
48 |
49 |     loop {}
50 | }
51 | ```
52 |
53 | `panic-halt` crate -- halt the thread on panic; messages are discarded:
54 |
55 |
56 | ```rust,ignore
57 | #![no_std]
58 |
59 | use core::panic::PanicInfo;
60 |
61 | #[panic_handler]
62 | fn panic(_info: &PanicInfo) -> ! {
63 |     loop {}
64 | }
65 | ```
66 |
67 | `app` crate:
68 |
69 |
70 | ```rust,ignore
71 | #![no_std]
72 |
73 | // dev profile
74 | #[cfg(debug_assertions)]
75 | extern crate panic_semihosting;
76 |
77 | // release profile
78 | #[cfg(not(debug_assertions))]
79 | extern crate panic_halt;
80 |
81 | fn main() {
82 |     // ..
83 | }
84 | ```
85 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # The Rustonomicon
2 |
3 | The Dark Arts of Advanced and Unsafe Rust Programming
4 |
5 | Nicknamed "the Nomicon."
6 |
7 | ## NOTE: This is a draft document, and may contain serious errors
8 |
9 | > Instead of the programs I had hoped for, there came only a shuddering
10 | blackness and ineffable loneliness; and I saw at last a fearful truth which no
11 | one had ever dared to breathe before — the unwhisperable secret of secrets — The
12 | fact that this language of stone and stridor is not a sentient perpetuation of
13 | Rust as London is of Old London and Paris of Old Paris, but that it is in fact
14 | quite unsafe, its sprawling body imperfectly embalmed and infested with queer
15 | animate things which have nothing to do with it as it was in compilation.
16 |
17 | This book digs into all the awful details that are necessary to understand in
18 | order to write correct Unsafe Rust programs. Due to the nature of this problem,
19 | it may lead to unleashing untold horrors that shatter your psyche into a billion
20 | infinitesimal fragments of despair.
21 |
22 | ## Requirements
23 |
24 | Building the Nomicon requires [mdBook]. To get it:
25 |
26 | [mdBook]: https://github.com/rust-lang/mdBook
27 |
28 | ```bash
29 | cargo install mdbook
30 | ```
31 |
32 | ### `mdbook` usage
33 |
34 | To build the Nomicon use the `build` sub-command:
35 |
36 | ```bash
37 | mdbook build
38 | ```
39 |
40 | The output will be placed in the `book` subdirectory. To check it out, open the
41 | `index.html` file in your web browser. You can pass the `--open` flag to `mdbook
42 | build` and it'll open the index page in your default browser (if the process is
43 | successful) just like with `cargo doc --open`:
44 |
45 | ```bash
46 | mdbook build --open
47 | ```
48 |
49 | There is also a `test` sub-command to test all code samples contained in the book:
50 |
51 | ```bash
52 | mdbook test
53 | ```
54 |
55 | ### `linkcheck`
56 |
57 | We use the `linkcheck` tool to find broken links.
58 | To run it locally:
59 |
60 | ```sh
61 | curl -sSLo linkcheck.sh https://raw.githubusercontent.com/rust-lang/rust/master/src/tools/linkchecker/linkcheck.sh
62 | sh linkcheck.sh --all nomicon
63 | ```
64 |
65 | ## Contributing
66 |
67 | Given that the Nomicon is still in a draft state, we'd love your help! Please
68 | feel free to open issues about anything, and send in PRs for things you'd like
69 | to fix or change. If your change is large, please open an issue first, so we can
70 | make sure that it's something we'd accept before you go through the work of
71 | getting a PR together.
72 |
--------------------------------------------------------------------------------
/src/constructors.md:
--------------------------------------------------------------------------------
1 | # Constructors
2 |
3 | There is exactly one way to create an instance of a user-defined type: name it,
4 | and initialize all its fields at once:
5 |
6 | ```rust
7 | struct Foo {
8 |     a: u8,
9 |     b: u32,
10 |     c: bool,
11 | }
12 |
13 | enum Bar {
14 |     X(u32),
15 |     Y(bool),
16 | }
17 |
18 | struct Unit;
19 |
20 | let foo = Foo { a: 0, b: 1, c: false };
21 | let bar = Bar::X(0);
22 | let empty = Unit;
23 | ```
24 |
25 | That's it. Every other way you make an instance of a type is just calling a
26 | totally vanilla function that does some stuff and eventually bottoms out to The
27 | One True Constructor.
28 |
29 | Unlike C++, Rust does not come with a slew of built-in kinds of constructor.
30 | There are no Copy, Default, Assignment, Move, or whatever constructors. The
31 | reasons for this are varied, but it largely boils down to Rust's philosophy of
32 | *being explicit*.
33 |
34 | Move constructors are meaningless in Rust because we don't enable types to
35 | "care" about their location in memory. Every type must be ready for it to be
36 | blindly memcopied to somewhere else in memory. This means pure on-the-stack-but-
37 | still-movable intrusive linked lists are simply not happening in Rust (safely).
38 |
39 | Assignment and copy constructors similarly don't exist because move semantics
40 | are the only semantics in Rust. At most `x = y` just moves the bits of y into
41 | the x variable. Rust does provide two facilities for providing C++'s copy-
42 | oriented semantics: `Copy` and `Clone`. Clone is our moral equivalent of a copy
43 | constructor, but it's never implicitly invoked. You have to explicitly call
44 | `clone` on an element you want to be cloned. Copy is a special case of Clone
45 | where the implementation is just "copy the bits". Copy types *are* implicitly
46 | cloned whenever they're moved, but because of the definition of Copy this just
47 | means not treating the old copy as uninitialized -- a no-op.
48 |
49 | While Rust provides a `Default` trait for specifying the moral equivalent of a
50 | default constructor, it's incredibly rare for this trait to be used. This is
51 | because variables [aren't implicitly initialized][uninit]. Default is basically
52 | only useful for generic programming. In concrete contexts, a type will provide a
53 | static `new` method for any kind of "default" constructor. This has no relation
54 | to `new` in other languages and has no special meaning. It's just a naming
55 | convention.
56 |
57 | TODO: talk about "placement new"?
58 |
59 | [uninit]: uninitialized.html
60 |
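To make the conventions above a bit more concrete, here is a small sketch. `Config`, `make_default`, and `copy_and_clone` are invented names for illustration; the only std pieces are the `Clone` and `Default` traits and the built-in `Copy` behavior of `u32`.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Config {
    name: String,
    retries: u32,
}

impl Config {
    /// `new` is only a naming convention: a plain function that bottoms out
    /// in the one true constructor (the struct literal).
    pub fn new(name: &str) -> Config {
        Config { name: name.to_string(), retries: 3 }
    }
}

impl Default for Config {
    fn default() -> Config {
        Config::new("default")
    }
}

/// `Default` earns its keep in generic code like this, where the concrete
/// type (and therefore its `new`) is not known.
pub fn make_default<T: Default>() -> T {
    T::default()
}

pub fn copy_and_clone() -> (u32, u32, Config, Config) {
    let x: u32 = 7;
    let y = x; // `u32` is Copy: the bits are copied and `x` stays usable
    let a = Config::new("a");
    let b = a.clone(); // Clone is never implicit; we have to ask for it
    (x, y, a, b)
}
```

Writing `let b = a;` instead of `a.clone()` would move `a`, and returning `a` afterwards would be a compile error.
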
--------------------------------------------------------------------------------
/src/arc-mutex/arc-final.md:
--------------------------------------------------------------------------------
1 | # Final Code
2 |
3 | Here's the final code, with some added comments and re-ordered imports:
4 |
5 | ```rust
6 | use std::marker::PhantomData;
7 | use std::ops::Deref;
8 | use std::ptr::NonNull;
9 | use std::sync::atomic::{self, AtomicUsize, Ordering};
10 |
11 | pub struct Arc<T> {
12 |     ptr: NonNull<ArcInner<T>>,
13 |     phantom: PhantomData<ArcInner<T>>,
14 | }
15 |
16 | pub struct ArcInner<T> {
17 |     rc: AtomicUsize,
18 |     data: T,
19 | }
20 |
21 | impl<T> Arc<T> {
22 |     pub fn new(data: T) -> Arc<T> {
23 |         // We start the reference count at 1, as that first reference is the
24 |         // current pointer.
25 |         let boxed = Box::new(ArcInner {
26 |             rc: AtomicUsize::new(1),
27 |             data,
28 |         });
29 |         Arc {
30 |             // It is okay to call `.unwrap()` here as we get a pointer from
31 |             // `Box::into_raw` which is guaranteed to not be null.
32 |             ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
33 |             phantom: PhantomData,
34 |         }
35 |     }
36 | }
37 |
38 | unsafe impl<T: Sync + Send> Send for Arc<T> {}
39 | unsafe impl<T: Sync + Send> Sync for Arc<T> {}
40 |
41 | impl<T> Deref for Arc<T> {
42 |     type Target = T;
43 |
44 |     fn deref(&self) -> &T {
45 |         let inner = unsafe { self.ptr.as_ref() };
46 |         &inner.data
47 |     }
48 | }
49 |
50 | impl<T> Clone for Arc<T> {
51 |     fn clone(&self) -> Arc<T> {
52 |         let inner = unsafe { self.ptr.as_ref() };
53 |         // Using a relaxed ordering is alright here as we don't need any atomic
54 |         // synchronization here as we're not modifying or accessing the inner
55 |         // data.
56 |         let old_rc = inner.rc.fetch_add(1, Ordering::Relaxed);
57 |
58 |         if old_rc >= isize::MAX as usize {
59 |             std::process::abort();
60 |         }
61 |
62 |         Self {
63 |             ptr: self.ptr,
64 |             phantom: PhantomData,
65 |         }
66 |     }
67 | }
68 |
69 | impl<T> Drop for Arc<T> {
70 |     fn drop(&mut self) {
71 |         let inner = unsafe { self.ptr.as_ref() };
72 |         if inner.rc.fetch_sub(1, Ordering::Release) != 1 {
73 |             return;
74 |         }
75 |         // This fence is needed to prevent reordering of the use and deletion
76 |         // of the data.
77 |         atomic::fence(Ordering::Acquire);
78 |         // This is safe as we know we have the last pointer to the `ArcInner`
79 |         // and that its pointer is valid.
80 |         unsafe { Box::from_raw(self.ptr.as_ptr()); }
81 |     }
82 | }
83 | ```
84 |
--------------------------------------------------------------------------------
/src/SUMMARY.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | [Introduction](intro.md)
4 |
5 | * [Meet Safe and Unsafe](meet-safe-and-unsafe.md)
6 |     * [How Safe and Unsafe Interact](safe-unsafe-meaning.md)
7 |     * [What Unsafe Can Do](what-unsafe-does.md)
8 |     * [Working with Unsafe](working-with-unsafe.md)
9 | * [Data Layout](data.md)
10 |     * [repr(Rust)](repr-rust.md)
11 |     * [Exotically Sized Types](exotic-sizes.md)
12 |     * [Other reprs](other-reprs.md)
13 | * [Ownership](ownership.md)
14 |     * [References](references.md)
15 |     * [Aliasing](aliasing.md)
16 |     * [Lifetimes](lifetimes.md)
17 |     * [Limits of Lifetimes](lifetime-mismatch.md)
18 |     * [Lifetime Elision](lifetime-elision.md)
19 |     * [Unbounded Lifetimes](unbounded-lifetimes.md)
20 |     * [Higher-Rank Trait Bounds](hrtb.md)
21 |     * [Subtyping and Variance](subtyping.md)
22 |     * [Drop Check](dropck.md)
23 |     * [PhantomData](phantom-data.md)
24 |     * [Splitting Borrows](borrow-splitting.md)
25 | * [Type Conversions](conversions.md)
26 |     * [Coercions](coercions.md)
27 |     * [The Dot Operator](dot-operator.md)
28 |     * [Casts](casts.md)
29 |     * [Transmutes](transmutes.md)
30 | * [Uninitialized Memory](uninitialized.md)
31 |     * [Checked](checked-uninit.md)
32 |     * [Drop Flags](drop-flags.md)
33 |     * [Unchecked](unchecked-uninit.md)
34 | * [Ownership Based Resource Management](obrm.md)
35 |     * [Constructors](constructors.md)
36 |     * [Destructors](destructors.md)
37 |     * [Leaking](leaking.md)
38 | * [Unwinding](unwinding.md)
39 |     * [Exception Safety](exception-safety.md)
40 |     * [Poisoning](poisoning.md)
41 | * [Concurrency](concurrency.md)
42 |     * [Races](races.md)
43 |     * [Send and Sync](send-and-sync.md)
44 |     * [Atomics](atomics.md)
45 | * [Implementing Vec](./vec/vec.md)
46 |     * [Layout](./vec/vec-layout.md)
47 |     * [Allocating](./vec/vec-alloc.md)
48 |     * [Push and Pop](./vec/vec-push-pop.md)
49 |     * [Deallocating](./vec/vec-dealloc.md)
50 |     * [Deref](./vec/vec-deref.md)
51 |     * [Insert and Remove](./vec/vec-insert-remove.md)
52 |     * [IntoIter](./vec/vec-into-iter.md)
53 |     * [RawVec](./vec/vec-raw.md)
54 |     * [Drain](./vec/vec-drain.md)
55 |     * [Handling Zero-Sized Types](./vec/vec-zsts.md)
56 |     * [Final Code](./vec/vec-final.md)
57 | * [Implementing Arc and Mutex](./arc-mutex/arc-and-mutex.md)
58 |     * [Arc](./arc-mutex/arc.md)
59 |         * [Layout](./arc-mutex/arc-layout.md)
60 |         * [Base Code](./arc-mutex/arc-base.md)
61 |         * [Cloning](./arc-mutex/arc-clone.md)
62 |         * [Dropping](./arc-mutex/arc-drop.md)
63 |         * [Final Code](./arc-mutex/arc-final.md)
64 | * [FFI](ffi.md)
65 | * [Beneath `std`](beneath-std.md)
66 |     * [#[panic_handler]](panic-handler.md)
67 |
--------------------------------------------------------------------------------
/src/lifetime-elision.md:
--------------------------------------------------------------------------------
1 | # Lifetime Elision
2 |
3 | In order to make common patterns more ergonomic, Rust allows lifetimes to be
4 | *elided* in function signatures.
5 |
6 | A *lifetime position* is anywhere you can write a lifetime in a type:
7 |
8 |
9 | ```rust,ignore
10 | &'a T
11 | &'a mut T
12 | T<'a>
13 | ```
14 |
15 | Lifetime positions can appear as either "input" or "output":
16 |
17 | * For `fn` definitions, `fn` types, and the traits `Fn`, `FnMut`, and `FnOnce`,
18 | input refers to the types of the formal arguments, while output refers to
19 | result types. So `fn foo(s: &str) -> (&str, &str)` has elided one lifetime in
20 | input position and two lifetimes in output position. Note that the input
21 | positions of a `fn` method definition do not include the lifetimes that occur
22 | in the method's `impl` header (nor lifetimes that occur in the trait header,
23 | for a default method).
24 |
25 | * For `impl` headers, all types are input. So `impl Trait<&T> for Struct<&T>`
26 | has elided two lifetimes in input position, while `impl Struct<&T>` has elided
27 | one.
28 |
29 | Elision rules are as follows:
30 |
31 | * Each elided lifetime in input position becomes a distinct lifetime
32 | parameter.
33 |
34 | * If there is exactly one input lifetime position (elided or not), that lifetime
35 | is assigned to *all* elided output lifetimes.
36 |
37 | * If there are multiple input lifetime positions, but one of them is `&self` or
38 | `&mut self`, the lifetime of `self` is assigned to *all* elided output lifetimes.
39 |
40 | * Otherwise, it is an error to elide an output lifetime.
41 |
42 | Examples:
43 |
44 |
45 | ```rust,ignore
46 | fn print(s: &str); // elided
47 | fn print<'a>(s: &'a str); // expanded
48 |
49 | fn debug(lvl: usize, s: &str); // elided
50 | fn debug<'a>(lvl: usize, s: &'a str); // expanded
51 |
52 | fn substr(s: &str, until: usize) -> &str; // elided
53 | fn substr<'a>(s: &'a str, until: usize) -> &'a str; // expanded
54 |
55 | fn get_str() -> &str; // ILLEGAL
56 |
57 | fn frob(s: &str, t: &str) -> &str; // ILLEGAL
58 |
59 | fn get_mut(&mut self) -> &mut T; // elided
60 | fn get_mut<'a>(&'a mut self) -> &'a mut T; // expanded
61 |
62 | fn args<T: ToCStr>(&mut self, args: &[T]) -> &mut Command // elided
63 | fn args<'a, 'b, T: ToCStr>(&'a mut self, args: &'b [T]) -> &'a mut Command // expanded
64 |
65 | fn new(buf: &mut [u8]) -> BufWriter; // elided
66 | fn new(buf: &mut [u8]) -> BufWriter<'_>; // elided (with `rust_2018_idioms`)
67 | fn new<'a>(buf: &'a mut [u8]) -> BufWriter<'a> // expanded
68 | ```
69 |
--------------------------------------------------------------------------------
/src/ownership.md:
--------------------------------------------------------------------------------
1 | # Ownership and Lifetimes
2 |
3 | Ownership is the breakout feature of Rust. It allows Rust to be completely
4 | memory-safe and efficient, while avoiding garbage collection. Before getting
5 | into the ownership system in detail, we will consider the motivation of this
6 | design.
7 |
8 | We will assume that you accept that garbage collection (GC) is not always an
9 | optimal solution, and that it is desirable to manually manage memory in some
10 | contexts. If you do not accept this, might I interest you in a different
11 | language?
12 |
13 | Regardless of your feelings on GC, it is pretty clearly a *massive* boon to
14 | making code safe. You never have to worry about things going away *too soon*
15 | (although whether you still wanted to be pointing at that thing is a different
16 | issue...). This is a pervasive problem that C and C++ programs need to deal
17 | with. Consider this simple mistake that all of us who have used a non-GC'd
18 | language have made at one point:
19 |
20 | ```rust,compile_fail
21 | fn as_str(data: &u32) -> &str {
22 |     // compute the string
23 |     let s = format!("{}", data);
24 |
25 |     // OH NO! We returned a reference to something that
26 |     // exists only in this function!
27 |     // Dangling pointer! Use after free! Alas!
28 |     // (this does not compile in Rust)
29 |     &s
30 | }
31 | ```
32 |
33 | This is exactly what Rust's ownership system was built to solve.
34 | Rust knows the scope in which the `&s` lives, and as such can prevent it from
35 | escaping. However this is a simple case that even a C compiler could plausibly
36 | catch. Things get more complicated as code gets bigger and pointers get fed through
37 | various functions. Eventually, a C compiler will fall down and won't be able to
38 | perform sufficient escape analysis to prove your code unsound. It will consequently
39 | be forced to accept your program on the assumption that it is correct.
40 |
41 | This will never happen to Rust. It's up to the programmer to prove to the
42 | compiler that everything is sound.
43 |
44 | Of course, Rust's story around ownership is much more complicated than just
45 | verifying that references don't escape the scope of their referent. That's
46 | because ensuring pointers are always valid is much more complicated than this.
47 | For instance in this code,
48 |
49 | ```rust,compile_fail
50 | let mut data = vec![1, 2, 3];
51 | // get an internal reference
52 | let x = &data[0];
53 |
54 | // OH NO! `push` causes the backing storage of `data` to be reallocated.
55 | // Dangling pointer! Use after free! Alas!
56 | // (this does not compile in Rust)
57 | data.push(4);
58 |
59 | println!("{}", x);
60 | ```
61 |
62 | naive scope analysis would be insufficient to prevent this bug, because `data`
63 | does in fact live as long as we needed. However it was *changed* while we had
64 | a reference into it. This is why Rust requires any references to freeze the
65 | referent and its owners.
66 |
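For contrast, here is one possible way to write both examples so that they compile. This is a sketch, not the only fix: `as_string` and `read_then_push` are invented names, and the second function just makes sure the reference is no longer in use when `push` runs.

```rust
/// Returns an owned `String` instead of a reference to a local, so nothing
/// dangles when the function returns.
pub fn as_string(data: &u32) -> String {
    format!("{}", data)
}

pub fn read_then_push() -> (i32, Vec<i32>) {
    let mut data = vec![1, 2, 3];

    // Get an internal reference and copy the value out of it.
    let x = &data[0];
    let first = *x;

    // `x` is never used again, so the borrow has ended and `data` is no
    // longer frozen. Mutating it is fine now.
    data.push(4);

    (first, data)
}
```

Adding a `println!("{}", x)` after the `push` would bring back the original error, because the borrow would then have to stay alive across the mutation.
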
--------------------------------------------------------------------------------
/src/arc-mutex/arc-layout.md:
--------------------------------------------------------------------------------
1 | # Layout
2 |
3 | Let's start by making the layout for our implementation of `Arc`.
4 |
5 | An `Arc<T>` provides thread-safe shared ownership of a value of type `T`,
6 | allocated in the heap. Sharing implies immutability in Rust, so we don't need to
7 | design anything that manages access to that value, right? Although interior
8 | mutability types like Mutex allow Arc's users to create shared mutability, Arc
9 | itself doesn't need to concern itself with these issues.
10 |
11 | However there _is_ one place where Arc needs to concern itself with mutation:
12 | destruction. When all the owners of the Arc go away, we need to be able to
13 | `drop` its contents and free its allocation. So we need a way for an owner to
14 | know if it's the _last_ owner, and the simplest way to do that is with a count
15 | of the owners -- Reference Counting.
16 |
17 | Unfortunately, this reference count is inherently shared mutable state, so Arc
18 | _does_ need to think about synchronization. We _could_ use a Mutex for this, but
19 | that's overkill. Instead, we'll use atomics. And since everyone already needs a
20 | pointer to the T's allocation, we might as well put the reference count in that
21 | same allocation.
22 |
23 | Naively, it would look something like this:
24 |
25 | ```rust
26 | use std::sync::atomic;
27 |
28 | pub struct Arc<T> {
29 |     ptr: *mut ArcInner<T>,
30 | }
31 |
32 | pub struct ArcInner<T> {
33 |     rc: atomic::AtomicUsize,
34 |     data: T,
35 | }
36 | ```
37 |
38 | This would compile, however it would be incorrect. First of all, the compiler
39 | will give us too strict variance. For example, an `Arc<&'static str>` couldn't
40 | be used where an `Arc<&'a str>` was expected. More importantly, it will give
41 | incorrect ownership information to the drop checker, as it will assume we don't
42 | own any values of type `T`. As this is a structure providing shared ownership of
43 | a value, at some point there will be an instance of this structure that entirely
44 | owns its data. See [the chapter on ownership and lifetimes](../ownership.md) for
45 | all the details on variance and drop check.
46 |
47 | To fix the first problem, we can use `NonNull<T>`. Note that `NonNull<T>` is a
48 | wrapper around a raw pointer that declares that:
49 |
50 | * We are covariant over `T`
51 | * Our pointer is never null
52 |
53 | To fix the second problem, we can include a `PhantomData` marker containing an
54 | `ArcInner<T>`. This will tell the drop checker that we have some notion of
55 | ownership of a value of `ArcInner<T>` (which itself contains some `T`).
56 |
57 | With these changes we get our final structure:
58 |
59 | ```rust
60 | use std::marker::PhantomData;
61 | use std::ptr::NonNull;
62 | use std::sync::atomic::AtomicUsize;
63 |
64 | pub struct Arc<T> {
65 |     ptr: NonNull<ArcInner<T>>,
66 |     phantom: PhantomData<ArcInner<T>>,
67 | }
68 |
69 | pub struct ArcInner<T> {
70 |     rc: AtomicUsize,
71 |     data: T,
72 | }
73 | ```
74 |
--------------------------------------------------------------------------------
/src/hrtb.md:
--------------------------------------------------------------------------------
1 | # Higher-Rank Trait Bounds (HRTBs)
2 |
3 | Rust's `Fn` traits are a little bit magic. For instance, we can write the
4 | following code:
5 |
6 | ```rust
7 | struct Closure<F> {
8 |     data: (u8, u16),
9 |     func: F,
10 | }
11 |
12 | impl<F> Closure<F>
13 |     where F: Fn(&(u8, u16)) -> &u8,
14 | {
15 |     fn call(&self) -> &u8 {
16 |         (self.func)(&self.data)
17 |     }
18 | }
19 |
20 | fn do_it(data: &(u8, u16)) -> &u8 { &data.0 }
21 |
22 | fn main() {
23 |     let clo = Closure { data: (0, 1), func: do_it };
24 |     println!("{}", clo.call());
25 | }
26 | ```
27 |
28 | If we try to naively desugar this code in the same way that we did in the
29 | [lifetimes section][lt], we run into some trouble:
30 |
31 |
32 | ```rust,ignore
33 | // NOTE: `&'b data.0` and `'x: {` is not valid syntax!
34 | struct Closure<F> {
35 |     data: (u8, u16),
36 |     func: F,
37 | }
38 |
39 | impl<F> Closure<F>
40 |     // where F: Fn(&'??? (u8, u16)) -> &'??? u8,
41 | {
42 |     fn call<'a>(&'a self) -> &'a u8 {
43 |         (self.func)(&self.data)
44 |     }
45 | }
46 |
47 | fn do_it<'b>(data: &'b (u8, u16)) -> &'b u8 { &'b data.0 }
48 |
49 | fn main() {
50 |     'x: {
51 |         let clo = Closure { data: (0, 1), func: do_it };
52 |         println!("{}", clo.call());
53 |     }
54 | }
55 | ```
56 |
57 | How on earth are we supposed to express the lifetimes on `F`'s trait bound? We
58 | need to provide some lifetime there, but the lifetime we care about can't be
59 | named until we enter the body of `call`! Also, that isn't some fixed lifetime;
60 | `call` works with *any* lifetime `&self` happens to have at that point.
61 |
62 | This job requires The Magic of Higher-Rank Trait Bounds (HRTBs). The way we
63 | desugar this is as follows:
64 |
65 |
66 | ```rust,ignore
67 | where for<'a> F: Fn(&'a (u8, u16)) -> &'a u8,
68 | ```
69 |
70 | Alternatively:
71 |
72 |
73 | ```rust,ignore
74 | where F: for<'a> Fn(&'a (u8, u16)) -> &'a u8,
75 | ```
76 |
77 | (Where `Fn(a, b, c) -> d` is itself just sugar for the unstable *real* `Fn`
78 | trait)
79 |
80 | `for<'a>` can be read as "for all choices of `'a`", and basically produces an
81 | *infinite list* of trait bounds that F must satisfy. Intense. There aren't many
82 | places outside of the `Fn` traits where we encounter HRTBs, and even for
83 | those we have a nice magic sugar for the common cases.
84 |
85 | In summary, we can rewrite the original code more explicitly as:
86 |
87 | ```rust
88 | struct Closure<F> {
89 |     data: (u8, u16),
90 |     func: F,
91 | }
92 |
93 | impl<F> Closure<F>
94 |     where for<'a> F: Fn(&'a (u8, u16)) -> &'a u8,
95 | {
96 |     fn call(&self) -> &u8 {
97 |         (self.func)(&self.data)
98 |     }
99 | }
100 |
101 | fn do_it(data: &(u8, u16)) -> &u8 { &data.0 }
102 |
103 | fn main() {
104 |     let clo = Closure { data: (0, 1), func: do_it };
105 |     println!("{}", clo.call());
106 | }
107 | ```
108 |
109 | [lt]: lifetimes.html
110 |
--------------------------------------------------------------------------------
/src/unwinding.md:
--------------------------------------------------------------------------------
1 | # Unwinding
2 |
3 | Rust has a *tiered* error-handling scheme:
4 |
5 | * If something might reasonably be absent, Option is used.
6 | * If something goes wrong and can reasonably be handled, Result is used.
7 | * If something goes wrong and cannot reasonably be handled, the thread panics.
8 | * If something catastrophic happens, the program aborts.
9 |
10 | Option and Result are overwhelmingly preferred in most situations, especially
11 | since they can be promoted into a panic or abort at the API user's discretion.
12 | Panics cause the thread to halt normal execution and unwind its stack, calling
13 | destructors as if every function instantly returned.
14 |
15 | As of 1.0, Rust is of two minds when it comes to panics. In the long-long-ago,
16 | Rust was much more like Erlang. Like Erlang, Rust had lightweight tasks,
17 | and tasks were intended to kill themselves with a panic when they reached an
18 | untenable state. Unlike an exception in Java or C++, a panic could not be
19 | caught at any time. Panics could only be caught by the owner of the task, at which
20 | point they had to be handled or *that* task would itself panic.
21 |
22 | Unwinding was important to this story because if a task's
23 | destructors weren't called, it would cause memory and other system resources to
24 | leak. Since tasks were expected to die during normal execution, this would make
25 | Rust very poor for long-running systems!
26 |
27 | As the Rust we know today came to be, this style of programming grew out of
28 | fashion in the push for less-and-less abstraction. Light-weight tasks were
29 | killed in the name of heavy-weight OS threads. Still, on stable Rust as of 1.0
30 | panics can only be caught by the parent thread. This means catching a panic
31 | requires spinning up an entire OS thread! This unfortunately stands in conflict
32 | with Rust's philosophy of zero-cost abstractions.
33 |
34 | There is an API called [`catch_unwind`] that enables catching a panic
35 | without spawning a thread. Still, we would encourage you to only do this
36 | sparingly. In particular, Rust's current unwinding implementation is heavily
37 | optimized for the "doesn't unwind" case. If a program doesn't unwind, there
38 | should be no runtime cost for the program being *ready* to unwind. As a
39 | consequence, actually unwinding will be more expensive than in e.g. Java.
40 | Don't build your programs to unwind under normal circumstances. Ideally, you
41 | should only panic for programming errors or *extreme* problems.
42 |
43 | Rust's unwinding strategy is not specified to be fundamentally compatible
44 | with any other language's unwinding. As such, unwinding into Rust from another
45 | language, or unwinding into another language from Rust is Undefined Behavior.
46 | You must *absolutely* catch any panics at the FFI boundary! What you do at that
47 | point is up to you, but *something* must be done. If you fail to do this,
48 | at best, your application will crash and burn. At worst, your application *won't*
49 | crash and burn, and will proceed with completely clobbered state.
50 |
51 | [`catch_unwind`]: https://doc.rust-lang.org/std/panic/fn.catch_unwind.html
52 |
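As a minimal sketch of catching a panic before it reaches a foreign caller, the function below wraps the fallible work in `catch_unwind` and turns a panic into an error code. The name `checked_div` and the specific return codes are invented for illustration; only `std::panic::catch_unwind` is a real API here.

```rust
use std::panic;

/// Hypothetical FFI entry point. Returns 0 on success, -1 if `out` is null,
/// and -2 if the Rust code panicked. The caller must pass either null or a
/// pointer that is valid for writing an `i32`.
pub extern "C" fn checked_div(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return -1;
    }
    // `a / b` panics when `b == 0`. The panic is caught here instead of
    // unwinding into the caller.
    match panic::catch_unwind(|| a / b) {
        Ok(quotient) => {
            unsafe { *out = quotient };
            0
        }
        Err(_) => -2,
    }
}
```

The default panic hook still prints the panic message to stderr. `catch_unwind` only stops the unwinding, and it only helps when the program is built to unwind rather than abort on panic.
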
--------------------------------------------------------------------------------
/src/transmutes.md:
--------------------------------------------------------------------------------
1 | # Transmutes
2 |
3 | Get out of our way type system! We're going to reinterpret these bits or die
4 | trying! Even though this book is all about doing things that are unsafe, I
5 | really can't emphasize enough that you should deeply think about finding Another Way
6 | than the operations covered in this section. This is really, truly, the most
7 | horribly unsafe thing you can do in Rust. The guardrails here are dental floss.
8 |
9 | [`mem::transmute`][transmute] takes a value of type `T` and reinterprets
10 | it to have type `U`. The only restriction is that the `T` and `U` are verified
11 | to have the same size. The ways to cause Undefined Behavior with this are mind
12 | boggling.
13 |
14 | * First and foremost, creating an instance of *any* type with an invalid state
15 | is going to cause arbitrary chaos that can't really be predicted. Do not
16 | transmute `3` to `bool`. Even if you never *do* anything with the `bool`. Just
17 | don't.
18 |
19 | * Transmute has an overloaded return type. If you do not specify the return type
20 | it may produce a surprising type to satisfy inference.
21 |
22 | * Transmuting an `&` to `&mut` is Undefined Behavior. While certain usages may
23 | *appear* safe, note that the Rust optimizer is free to assume that a shared
24 | reference won't change through its lifetime and thus such transmutation will
25 | run afoul of those assumptions. So:
26 |   * Transmuting an `&` to `&mut` is *always* Undefined Behavior.
27 |   * No you can't do it.
28 |   * No you're not special.
29 |
30 | * Transmuting to a reference without an explicitly provided lifetime
31 | produces an [unbounded lifetime].
32 |
33 | * When transmuting between different compound types, you have to make sure they
34 | are laid out the same way! If layouts differ, the wrong fields are going to
35 | get filled with the wrong data, which will make you unhappy and can also be
36 | Undefined Behavior (see above).
37 |
38 | So how do you know if the layouts are the same? For `repr(C)` types and
39 | `repr(transparent)` types, layout is precisely defined. But for your
40 | run-of-the-mill `repr(Rust)`, it is not. Even different instances of the same
41 | generic type can have wildly different layout. `Vec<i32>` and `Vec<u32>`
42 | *might* have their fields in the same order, or they might not. The details of
43 | what exactly is and is not guaranteed for data layout are still being worked
44 | out over [at the UCG WG][ucg-layout].
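
To make the layout point concrete, here is a small sketch of a transmute between two `repr(C)` structs with identical fields. The types `PointA` and `PointB` and the helper `convert` are hypothetical names for illustration only. The example leans on `repr(C)` giving both structs the same, precisely defined layout, and it spells out both type parameters so that inference can't pick a surprising return type.

```rust
use std::mem;

// Both structs are `repr(C)` with the same field types in the same order,
// so their layouts are defined to match.
#[repr(C)]
pub struct PointA {
    pub x: u32,
    pub y: u32,
}

#[repr(C)]
pub struct PointB {
    pub x: u32,
    pub y: u32,
}

pub fn convert(a: PointA) -> PointB {
    // SAFETY: same size, same `repr(C)` layout, and every bit pattern of a
    // `u32` field is valid, so no invalid value can be produced.
    unsafe { mem::transmute::<PointA, PointB>(a) }
}
```

Without the `repr(C)` attributes this would still compile (the sizes match), but nothing would guarantee that `x` lands in `x`.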
45 |
46 | [`mem::transmute_copy`][transmute_copy] somehow manages to be *even more*
47 | wildly unsafe than this. It copies `size_of<U>` bytes out of an `&T` and
48 | interprets them as a `U`. The size check that `mem::transmute` has is gone (as
49 | it may be valid to copy out a prefix), though it is Undefined Behavior for `U`
50 | to be larger than `T`.
51 |
52 | Also of course you can get all of the functionality of these functions using raw
53 | pointer casts or `union`s, but without any of the lints or other basic sanity
54 | checks. Raw pointer casts and `union`s do not magically avoid the above rules.
55 |
56 | [unbounded lifetime]: ./unbounded-lifetimes.md
57 | [transmute]: ../std/mem/fn.transmute.html
58 | [transmute_copy]: ../std/mem/fn.transmute_copy.html
59 | [ucg-layout]: https://rust-lang.github.io/unsafe-code-guidelines/layout.html
60 |
--------------------------------------------------------------------------------
/src/beneath-std.md:
--------------------------------------------------------------------------------
1 | # Beneath `std`
2 |
3 | This section documents features that are normally provided by the `std` crate and
4 | that `#![no_std]` developers have to deal with (i.e. provide) to build
5 | `#![no_std]` binary crates.
6 |
7 | ## Using `libc`
8 |
9 | In order to build a `#![no_std]` executable we will need `libc` as a dependency.
10 | We can specify this using our `Cargo.toml` file:
11 |
12 | ```toml
13 | [dependencies]
14 | libc = { version = "0.2.146", default-features = false }
15 | ```
16 |
17 | Note that the default features have been disabled. This is a critical step -
18 | **the default features of `libc` include the `std` crate and so must be
19 | disabled.**
20 |
21 | Alternatively, we can use the unstable `rustc_private` feature together
22 | with an `extern crate libc;` declaration as shown in the examples below. Note that
23 | windows-msvc targets do not require a libc, and correspondingly there is no `libc`
24 | crate in their sysroot. We do not need the `extern crate libc;` below, and having it
25 | on a windows-msvc target would be a compile error.
26 |
27 | ## Writing an executable without `std`
28 |
29 | We will probably need a nightly version of the compiler to produce
30 | a `#![no_std]` executable because on many platforms, we have to provide the
31 | `eh_personality` [lang item], which is unstable.
32 |
33 | You will need to define a symbol for the entry point that is suitable for your target. For example, `main`, `_start`, `WinMain`, or whatever starting point is relevant for your target.
34 | You also need the `#![no_main]` attribute to prevent the compiler from attempting to generate an entry point itself.
35 |
36 | Finally, you must define a [panic handler function](panic-handler.html).
37 |
38 | ```rust
39 | #![feature(lang_items, core_intrinsics, rustc_private)]
40 | #![allow(internal_features)]
41 | #![no_std]
42 | #![no_main]
43 |
44 | // Necessary for `panic = "unwind"` builds on cfg(unix) platforms.
45 | #![feature(panic_unwind)]
46 | extern crate unwind;
47 |
48 | // Pull in the system libc library for what crt0.o likely requires.
49 | #[cfg(not(windows))]
50 | extern crate libc;
51 |
52 | use core::ffi::{c_char, c_int};
53 | use core::panic::PanicInfo;
54 |
55 | // Entry point for this program.
56 | #[unsafe(no_mangle)] // ensure that this symbol is included in the output as `main`
57 | extern "C" fn main(_argc: c_int, _argv: *const *const c_char) -> c_int {
58 |     0
59 | }
60 |
61 | // These functions are used by the compiler, but not for an empty program like this.
62 | // They are normally provided by `std`.
63 | #[lang = "eh_personality"]
64 | fn rust_eh_personality() {}
65 | #[panic_handler]
66 | fn panic_handler(_info: &PanicInfo) -> ! { core::intrinsics::abort() }
67 | ```
68 |
69 | If you are working with a target that doesn't have binary releases of the
70 | standard library available via rustup (this probably means you are building the
71 | `core` crate yourself) and need compiler-rt intrinsics (i.e. you are probably
72 | getting linker errors when building an executable:
73 | ``undefined reference to `__aeabi_memcpy'``), you need to manually link to the
74 | [`compiler_builtins` crate] to get those intrinsics and solve the linker errors.
75 |
76 | [`compiler_builtins` crate]: https://crates.io/crates/compiler_builtins
77 | [lang item]: https://doc.rust-lang.org/nightly/unstable-book/language-features/lang-items.html
78 |
--------------------------------------------------------------------------------
/src/intro.md:
--------------------------------------------------------------------------------
1 | # The Rustonomicon
2 |
3 |
4 |
5 | Warning:
6 | This book is incomplete.
7 | Documenting everything and rewriting outdated parts take a while.
8 | See the [issue tracker] to check what's missing/outdated, and if there are any mistakes or ideas that haven't been reported, feel free to open a new issue there.
9 |
10 |
11 |
12 | [issue tracker]: https://github.com/rust-lang/nomicon/issues
13 |
14 | ## The Dark Arts of Unsafe Rust
15 |
16 | > THE KNOWLEDGE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF UNLEASHING INDESCRIBABLE HORRORS THAT SHATTER YOUR PSYCHE AND SET YOUR MIND ADRIFT IN THE UNKNOWABLY INFINITE COSMOS.
17 |
18 | The Rustonomicon digs into all the awful details that you need to understand when writing Unsafe Rust programs.
19 |
20 | Should you wish a long and happy career of writing Rust programs, you should turn back now and forget you ever saw this book.
21 | It is not necessary.
22 | However if you intend to write unsafe code — or just want to dig into the guts of the language — this book contains lots of useful information.
23 |
24 | Unlike *[The Rust Programming Language][trpl]*, we will be assuming considerable prior knowledge.
25 | In particular, you should be comfortable with basic systems programming and Rust.
26 | If you don't feel comfortable with these topics, you should consider reading [The Book][trpl] first.
27 | That said, we won't assume you have read it, and we will take care to occasionally give a refresher on the basics where appropriate.
28 | You can skip straight to this book if you want; just know that we won't be explaining everything from the ground up.
29 |
30 | This book exists primarily as a high-level companion to [The Reference][ref].
31 | Where The Reference exists to detail the syntax and semantics of every part of the language, The Rustonomicon exists to describe how to use those pieces together, and the issues that you will have in doing so.
32 |
33 | The Reference will tell you the syntax and semantics of references, destructors, and unwinding, but it won't tell you how combining them can lead to exception-safety issues, or how to deal with those issues.
34 |
35 | It should be noted that we haven't synced The Rustonomicon and The Reference well, so they may have duplicate content.
36 | In general, if the two documents disagree, The Reference should be assumed to be correct (it isn't yet considered normative, it's just better maintained).
37 |
38 | Topics that are within the scope of this book include: the meaning of (un)safety, unsafe primitives provided by the language and standard library, techniques for creating safe abstractions with those unsafe primitives, subtyping and variance, exception-safety (panic/unwind-safety), working with uninitialized memory, type punning, concurrency, interoperating with other languages (FFI), optimization tricks, how constructs lower to compiler/OS/hardware primitives, how to **not** make the memory model people angry, how you're **going** to make the memory model people angry, and more.
39 |
40 | The Rustonomicon is not a place to exhaustively describe the semantics and guarantees of every single API in the standard library, nor is it a place to exhaustively describe every feature of Rust.
41 |
42 | Unless otherwise noted, Rust code in this book uses the Rust 2024 edition.
43 |
44 | [trpl]: ../book/index.html
45 | [ref]: ../reference/index.html
46 |
--------------------------------------------------------------------------------
/src/checked-uninit.md:
--------------------------------------------------------------------------------
1 | # Checked Uninitialized Memory
2 |
3 | Like C, all stack variables in Rust are uninitialized until a value is
4 | explicitly assigned to them. Unlike C, Rust statically prevents you from ever
5 | reading them until you do:
6 |
7 | ```rust,compile_fail
8 | fn main() {
9 |     let x: i32;
10 |     println!("{}", x);
11 | }
12 | ```
13 |
14 | ```text
15 |   |
16 | 3 |     println!("{}", x);
17 |   |                    ^ use of possibly uninitialized `x`
18 | ```
19 |
20 | This is based off of a basic branch analysis: every branch must assign a value
21 | to `x` before it is first used. For short, we also say that "`x` is init" or
22 | "`x` is uninit".
23 |
24 | Interestingly, Rust doesn't require the variable
25 | to be mutable to perform a delayed initialization if every branch assigns
26 | exactly once. However the analysis does not take advantage of constant analysis
27 | or anything like that. So this compiles:
28 |
29 | ```rust
30 | fn main() {
31 |     let x: i32;
32 |
33 |     if true {
34 |         x = 1;
35 |     } else {
36 |         x = 2;
37 |     }
38 |
39 |     println!("{}", x);
40 | }
41 | ```
42 |
43 | but this doesn't:
44 |
45 | ```rust,compile_fail
46 | fn main() {
47 |     let x: i32;
48 |     if true {
49 |         x = 1;
50 |     }
51 |     println!("{}", x);
52 | }
53 | ```
54 |
55 | ```text
56 |   |
57 | 6 |     println!("{}", x);
58 |   |                    ^ use of possibly uninitialized `x`
59 | ```
60 |
61 | while this does:
62 |
63 | ```rust
64 | fn main() {
65 |     let x: i32;
66 |     if true {
67 |         x = 1;
68 |         println!("{}", x);
69 |     }
70 |     // Don't care that there are branches where it's not initialized
71 |     // since we don't use the value in those branches
72 | }
73 | ```
74 |
75 | Of course, while the analysis doesn't consider actual values, it does
76 | have a relatively sophisticated understanding of dependencies and control
77 | flow. For instance, this works:
78 |
79 | ```rust
80 | let x: i32;
81 |
82 | loop {
83 |     // Rust doesn't understand that this branch will be taken unconditionally,
84 |     // because it relies on actual values.
85 |     if true {
86 |         // But it does understand that it will only be taken once because
87 |         // we unconditionally break out of it. Therefore `x` doesn't
88 |         // need to be marked as mutable.
89 |         x = 0;
90 |         break;
91 |     }
92 | }
93 | // It also knows that it's impossible to get here without reaching the break.
94 | // And therefore that `x` must be initialized here!
95 | println!("{}", x);
96 | ```
97 |
98 | If a value is moved out of a variable, that variable becomes logically
99 | uninitialized if the type of the value isn't Copy. That is:
100 |
101 | ```rust
102 | fn main() {
103 |     let x = 0;
104 |     let y = Box::new(0);
105 |     let z1 = x; // x is still valid because i32 is Copy
106 |     let z2 = y; // y is now logically uninitialized because Box isn't Copy
107 | }
108 | ```
109 |
110 | However reassigning `y` in this example *would* require `y` to be marked as
111 | mutable, as a Safe Rust program could observe that the value of `y` changed:
112 |
113 | ```rust
114 | fn main() {
115 |     let mut y = Box::new(0);
116 |     let z = y; // y is now logically uninitialized because Box isn't Copy
117 |     y = Box::new(1); // reinitialize y
118 | }
119 | ```
120 |
121 | Otherwise it's like `y` is a brand new variable.
122 |
--------------------------------------------------------------------------------
/src/drop-flags.md:
--------------------------------------------------------------------------------
1 | # Drop Flags
2 |
3 | The examples in the previous section introduce an interesting problem for Rust.
4 | We have seen that it's possible to conditionally initialize, deinitialize, and
5 | reinitialize locations of memory totally safely. For Copy types, this isn't
6 | particularly notable since they're just a random pile of bits. However types
7 | with destructors are a different story: Rust needs to know whether to call a
8 | destructor whenever a variable is assigned to, or a variable goes out of scope.
9 | How can it do this with conditional initialization?
10 |
11 | Note that this is not a problem that all assignments need worry about. In
12 | particular, assigning through a dereference unconditionally drops, and assigning
13 | in a `let` unconditionally doesn't drop:
14 |
15 | ```rust
16 | let mut x = Box::new(0); // let makes a fresh variable, so never need to drop
17 | let y = &mut x;
18 | *y = Box::new(1); // Deref assumes the referent is initialized, so always drops
19 | ```
20 |
21 | This is only a problem when overwriting a previously initialized variable or
22 | one of its subfields.
23 |
24 | It turns out that Rust actually tracks whether a variable should be dropped or not
25 | *at runtime*. As a variable becomes initialized and uninitialized, a *drop flag*
26 | for that variable is toggled. When a variable might need to be dropped, this
27 | flag is evaluated to determine if it should be dropped.
28 |
29 | Of course, it is often the case that a value's initialization state can be
30 | statically known at every point in the program. If this is the case, then the
31 | compiler can theoretically generate more efficient code! For instance,
32 | straight-line code has such *static drop semantics*:
33 |
34 | ```rust
35 | let mut x = Box::new(0); // x was uninit; just overwrite.
36 | let mut y = x; // y was uninit; just overwrite and make x uninit.
37 | x = Box::new(0); // x was uninit; just overwrite.
38 | y = x; // y was init; Drop y, overwrite it, and make x uninit!
39 | // y goes out of scope; y was init; Drop y!
40 | // x goes out of scope; x was uninit; do nothing.
41 | ```
42 |
43 | Similarly, branched code where all branches have the same behavior with respect
44 | to initialization has static drop semantics:
45 |
46 | ```rust
47 | # let condition = true;
48 | let mut x = Box::new(0); // x was uninit; just overwrite.
49 | if condition {
50 |     drop(x) // x gets moved out; make x uninit.
51 | } else {
52 |     println!("{}", x);
53 |     drop(x) // x gets moved out; make x uninit.
54 | }
55 | x = Box::new(0); // x was uninit; just overwrite.
56 | // x goes out of scope; x was init; Drop x!
57 | ```
58 |
59 | However code like this *requires* runtime information to correctly Drop:
60 |
61 | ```rust
62 | # let condition = true;
63 | let x;
64 | if condition {
65 |     x = Box::new(0); // x was uninit; just overwrite.
66 |     println!("{}", x);
67 | }
68 | // x goes out of scope; x might be uninit;
69 | // check the flag!
70 | ```
71 |
72 | Of course, in this case it's trivial to retrieve static drop semantics:
73 |
74 | ```rust
75 | # let condition = true;
76 | if condition {
77 |     let x = Box::new(0);
78 |     println!("{}", x);
79 | }
80 | ```
81 |
82 | The drop flags are tracked on the stack.
83 | In old Rust versions, drop flags were stashed in a hidden field of types that implement `Drop`.
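
To watch the conditional drop happen, here is a small sketch that counts destructor calls. The type `Noisy` and the function `conditional_init` are invented for this illustration; the counter is only a way to observe the behavior from the outside, not a view of the drop flag itself.

```rust
use std::cell::Cell;

// A hypothetical type whose destructor bumps a shared counter.
pub struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

pub fn conditional_init(condition: bool, drops: &Cell<u32>) {
    let x;
    if condition {
        x = Noisy(drops); // x was uninit; just overwrite.
        let _borrow = &x; // use x so it isn't flagged as unused
    }
    // x goes out of scope; x might be uninit, so whether its destructor
    // runs depends on which branch was taken at runtime.
}
```

Calling `conditional_init(false, &drops)` leaves the counter untouched, while `conditional_init(true, &drops)` increments it exactly once.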
84 |
--------------------------------------------------------------------------------
/src/meet-safe-and-unsafe.md:
--------------------------------------------------------------------------------
1 | # Meet Safe and Unsafe
2 |
3 | 
4 |
5 | It would be great to not have to worry about low-level implementation details.
6 | Who could possibly care how much space the empty tuple occupies? Sadly, it
7 | sometimes matters and we need to worry about it. The most common reason
8 | developers start to care about implementation details is performance, but more
9 | importantly, these details can become a matter of correctness when interfacing
10 | directly with hardware, operating systems, or other languages.
11 |
12 | When implementation details start to matter in a safe programming language,
13 | programmers usually have three options:
14 |
15 | * fiddle with the code to encourage the compiler/runtime to perform an optimization
16 | * adopt a more unidiomatic or cumbersome design to get the desired implementation
17 | * rewrite the implementation in a language that lets you deal with those details
18 |
19 | For that last option, the language programmers tend to use is *C*. This is often
20 | necessary to interface with systems that only declare a C interface.
21 |
22 | Unfortunately, C is incredibly unsafe to use (sometimes for good reason),
23 | and this unsafety is magnified when trying to interoperate with another
24 | language. Care must be taken to ensure C and the other language agree on
25 | what's happening, and that they don't step on each other's toes.
26 |
27 | So what does this have to do with Rust?
28 |
29 | Well, unlike C, Rust is a safe programming language.
30 |
31 | But, like C, Rust is an unsafe programming language.
32 |
33 | More accurately, Rust *contains* both a safe and unsafe programming language.
34 |
35 | Rust can be thought of as a combination of two programming languages: *Safe
36 | Rust* and *Unsafe Rust*. Conveniently, these names mean exactly what they say:
37 | Safe Rust is Safe. Unsafe Rust is, well, not. In fact, Unsafe Rust lets us
38 | do some *really* unsafe things. Things the Rust authors will implore you not to
39 | do, but we'll do anyway.
40 |
41 | Safe Rust is the *true* Rust programming language. If all you do is write Safe
42 | Rust, you will never have to worry about type-safety or memory-safety. You will
43 | never endure a dangling pointer, a use-after-free, or any other kind of
44 | Undefined Behavior (a.k.a. UB).
45 |
46 | The standard library also gives you enough utilities out of the box that you'll
47 | be able to write high-performance applications and libraries in pure idiomatic
48 | Safe Rust.
49 |
50 | But maybe you want to talk to another language. Maybe you're writing a
51 | low-level abstraction not exposed by the standard library. Maybe you're
52 | *writing* the standard library (which is written entirely in Rust). Maybe you
53 | need to do something the type-system doesn't understand and just *frob some dang
54 | bits*. Maybe you need Unsafe Rust.
55 |
56 | Unsafe Rust is exactly like Safe Rust with all the same rules and semantics.
57 | It just lets you do some *extra* things that are Definitely Not Safe
58 | (which we will define in the next section).
59 |
60 | The value of this separation is that we gain the benefits of using an unsafe
61 | language like C — low level control over implementation details — without most
62 | of the problems that come with trying to integrate it with a completely
63 | different safe language.
64 |
65 | There are still some problems — most notably, we must become aware of properties
66 | that the type system assumes and audit them in any code that interacts with
67 | Unsafe Rust. That's the purpose of this book: to teach you about these assumptions
68 | and how to manage them.
69 |
--------------------------------------------------------------------------------
/src/arc-mutex/arc-clone.md:
--------------------------------------------------------------------------------
1 | # Cloning
2 |
3 | Now that we've got some basic code set up, we'll need a way to clone the `Arc`.
4 |
5 | Basically, we need to:
6 |
7 | 1. Increment the atomic reference count
8 | 2. Construct a new instance of the `Arc` from the inner pointer
9 |
10 | First, we need to get access to the `ArcInner`:
11 |
12 |
13 | ```rust,ignore
14 | let inner = unsafe { self.ptr.as_ref() };
15 | ```
16 |
17 | We can update the atomic reference count as follows:
18 |
19 |
20 | ```rust,ignore
21 | let old_rc = inner.rc.fetch_add(1, Ordering::???);
22 | ```
23 |
24 | But what ordering should we use here? We don't really have any code that will
25 | need atomic synchronization when cloning, as we do not modify the internal value
26 | while cloning. Thus, we can use a Relaxed ordering here, which implies no
27 | happens-before relationship but is atomic. When `Drop`ping the Arc, however,
28 | we'll need to atomically synchronize when decrementing the reference count. This
29 | is described more in [the section on the `Drop` implementation for
30 | `Arc`](arc-drop.md). For more information on atomic relationships and Relaxed
31 | ordering, see [the section on atomics](../atomics.md).
32 |
33 | Thus, the code becomes this:
34 |
35 |
36 | ```rust,ignore
37 | let old_rc = inner.rc.fetch_add(1, Ordering::Relaxed);
38 | ```
39 |
40 | We'll need to add another import to use `Ordering`:
41 |
42 | ```rust
43 | use std::sync::atomic::Ordering;
44 | ```
45 |
46 | However, we have one problem with this implementation right now. What if someone
47 | decides to `mem::forget` a bunch of Arcs? The code we have written so far (and
48 | will write) assumes that the reference count accurately portrays how many Arcs
49 | are in memory, but with `mem::forget` this is false. Thus, when more and more
50 | Arcs are cloned from this one without them being `Drop`ped and the reference
51 | count being decremented, we can overflow! This will cause use-after-free which
52 | is **INCREDIBLY BAD!**
53 |
54 | To handle this, we need to check that the reference count does not go over some
55 | arbitrary value (below `usize::MAX`, as we're storing the reference count as an
56 | `AtomicUsize`), and do *something*.
57 |
58 | The standard library's implementation decides to just abort the program (as it
59 | is an incredibly unlikely case in normal code and if it happens, the program is
60 | probably incredibly degenerate) if the reference count reaches `isize::MAX`
61 | (about half of `usize::MAX`) on any thread, on the assumption that there are
62 | probably not about 2 billion threads (or about **9 quintillion** on some 64-bit
63 | machines) incrementing the reference count at once. This is what we'll do.
64 |
65 | It's pretty simple to implement this behavior:
66 |
67 |
68 | ```rust,ignore
69 | if old_rc >= isize::MAX as usize {
70 |     std::process::abort();
71 | }
72 | ```
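
To see the increment and the overflow check in isolation, here is a small standalone sketch. The helper name `increment_or_abort` is hypothetical and not part of the `Arc` we are building; it just packages the two steps above so they can be run on a bare `AtomicUsize`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical helper: bumps the count and returns the value it had
/// *before* the increment, aborting if the count is implausibly large.
pub fn increment_or_abort(rc: &AtomicUsize) -> usize {
    // `fetch_add` returns the previous value, which is what we compare.
    let old_rc = rc.fetch_add(1, Ordering::Relaxed);

    if old_rc >= isize::MAX as usize {
        std::process::abort();
    }

    old_rc
}
```

Note that the check happens after the increment, so the count can briefly exceed `isize::MAX`; the large gap up to `usize::MAX` is what makes that tolerable.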
73 |
74 | Then, we need to return a new instance of the `Arc`:
75 |
76 |
77 | ```rust,ignore
78 | Self {
79 |     ptr: self.ptr,
80 |     phantom: PhantomData
81 | }
82 | ```
83 |
84 | Now, let's wrap this all up inside the `Clone` implementation:
85 |
86 |
87 | ```rust,ignore
88 | use std::sync::atomic::Ordering;
89 |
90 | impl<T> Clone for Arc<T> {
91 |     fn clone(&self) -> Arc<T> {
92 |         let inner = unsafe { self.ptr.as_ref() };
93 |         // Using a relaxed ordering is alright here as we don't need any atomic
94 |         // synchronization here as we're not modifying or accessing the inner
95 |         // data.
96 |         let old_rc = inner.rc.fetch_add(1, Ordering::Relaxed);
97 |
98 |         if old_rc >= isize::MAX as usize {
99 |             std::process::abort();
100 |         }
101 |
102 |         Self {
103 |             ptr: self.ptr,
104 |             phantom: PhantomData,
105 |         }
106 |     }
107 | }
108 | ```
109 |
--------------------------------------------------------------------------------
/src/lifetime-mismatch.md:
--------------------------------------------------------------------------------
1 | # Limits of Lifetimes
2 |
3 | Given the following code:
4 |
5 | ```rust,compile_fail
6 | #[derive(Debug)]
7 | struct Foo;
8 |
9 | impl Foo {
10 |     fn mutate_and_share(&mut self) -> &Self { &*self }
11 |     fn share(&self) {}
12 | }
13 |
14 | fn main() {
15 |     let mut foo = Foo;
16 |     let loan = foo.mutate_and_share();
17 |     foo.share();
18 |     println!("{:?}", loan);
19 | }
20 | ```
21 |
22 | One might expect it to compile. We call `mutate_and_share`, which mutably
23 | borrows `foo` temporarily, but then returns only a shared reference. Therefore
24 | we would expect `foo.share()` to succeed as `foo` shouldn't be mutably borrowed.
25 |
26 | However when we try to compile it:
27 |
28 | ```text
29 | error[E0502]: cannot borrow `foo` as immutable because it is also borrowed as mutable
30 |   --> src/main.rs:12:5
31 |    |
32 | 11 |     let loan = foo.mutate_and_share();
33 |    |                --- mutable borrow occurs here
34 | 12 |     foo.share();
35 |    |     ^^^ immutable borrow occurs here
36 | 13 |     println!("{:?}", loan);
37 | ```
38 |
39 | What happened? Well, we got the exact same reasoning as we did for
40 | [Example 2 in the previous section][ex2]. We desugar the program and we get
41 | the following:
42 |
43 |
44 | ```rust,ignore
45 | struct Foo;
46 |
47 | impl Foo {
48 |     fn mutate_and_share<'a>(&'a mut self) -> &'a Self { &'a *self }
49 |     fn share<'a>(&'a self) {}
50 | }
51 |
52 | fn main() {
53 |     'b: {
54 |         let mut foo: Foo = Foo;
55 |         'c: {
56 |             let loan: &'c Foo = Foo::mutate_and_share::<'c>(&'c mut foo);
57 |             'd: {
58 |                 Foo::share::<'d>(&'d foo);
59 |             }
60 |             println!("{:?}", loan);
61 |         }
62 |     }
63 | }
64 | ```
65 |
66 | The lifetime system is forced to extend the `&mut foo` to have lifetime `'c`,
67 | due to the lifetime of `loan` and `mutate_and_share`'s signature. Then when we
68 | try to call `share`, it sees we're trying to alias that `&'c mut foo` and
69 | blows up in our face!
70 |
71 | This program is clearly correct according to the reference semantics we actually
72 | care about, but the lifetime system is too coarse-grained to handle that.
73 |
74 | ## Improperly reduced borrows
75 |
76 | The following code fails to compile, because Rust sees that a variable, `map`,
77 | is borrowed twice, and cannot infer that the first borrow ceases to be needed
78 | before the second one occurs. This is caused by Rust conservatively falling back
79 | to using a whole scope for the first borrow. This will eventually get fixed.
80 |
81 | ```rust,compile_fail
82 | # use std::collections::HashMap;
83 | # use std::hash::Hash;
84 | fn get_default<'m, K, V>(map: &'m mut HashMap<K, V>, key: K) -> &'m mut V
85 | where
86 |     K: Clone + Eq + Hash,
87 |     V: Default,
88 | {
89 |     match map.get_mut(&key) {
90 |         Some(value) => value,
91 |         None => {
92 |             map.insert(key.clone(), V::default());
93 |             map.get_mut(&key).unwrap()
94 |         }
95 |     }
96 | }
97 | ```
98 |
99 | Because of the lifetime restrictions imposed, `&mut map`'s lifetime
100 | overlaps other mutable borrows, resulting in a compile error:
101 |
102 | ```text
103 | error[E0499]: cannot borrow `*map` as mutable more than once at a time
104 |   --> src/main.rs:12:13
105 |    |
106 | 4  |   fn get_default<'m, K, V>(map: &'m mut HashMap<K, V>, key: K) -> &'m mut V
107 |    |                  -- lifetime `'m` defined here
108 | ...
109 | 9  |       match map.get_mut(&key) {
110 |    |       -     --- first mutable borrow occurs here
111 |    |  _____|
112 |    | |
113 | 10 | |         Some(value) => value,
114 | 11 | |         None => {
115 | 12 | |             map.insert(key.clone(), V::default());
116 |    | |             ^^^ second mutable borrow occurs here
117 | 13 | |             map.get_mut(&key).unwrap()
118 | 14 | |         }
119 | 15 | |     }
120 |    | |_____- returning this value requires that `*map` is borrowed for `'m`
121 | ```
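
Until the borrow checker accepts the version above, one way to sidestep the problem is to restructure the code so that there is only a single mutable borrow. Here is a sketch of that idea using the standard `HashMap::entry` API. It is one possible workaround rather than the only one, and it happens to drop the `K: Clone` bound because the key is moved into the map.

```rust
use std::collections::HashMap;
use std::hash::Hash;

pub fn get_default<'m, K, V>(map: &'m mut HashMap<K, V>, key: K) -> &'m mut V
where
    K: Eq + Hash,
    V: Default,
{
    // `entry` borrows the map mutably exactly once, and `or_default`
    // inserts `V::default()` only if the key is missing.
    map.entry(key).or_default()
}
```

Because lookup and insertion go through the same borrow, there is no second `&mut` for the borrow checker to complain about.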
122 |
123 | [ex2]: lifetimes.html#example-aliasing-a-mutable-reference
124 |
--------------------------------------------------------------------------------
/src/races.md:
--------------------------------------------------------------------------------
1 | # Data Races and Race Conditions
2 |
3 | Safe Rust guarantees an absence of data races, which are defined as:
4 |
5 | * two or more threads concurrently accessing a location of memory
6 | * one or more of them is a write
7 | * one or more of them is unsynchronized
8 |
9 | A data race has Undefined Behavior, and is therefore impossible to perform in
10 | Safe Rust. Data races are prevented *mostly* through Rust's ownership system alone:
11 | it's impossible to alias a mutable reference, so it's impossible to perform a
12 | data race. Interior mutability makes this more complicated, which is largely why
13 | we have the Send and Sync traits (see the next section for more on this).
14 |
15 | **However Rust does not prevent general race conditions.**
16 |
17 | This is mathematically impossible in situations where you do not control the
18 | scheduler, which is true for the normal OS environment. If you do control
19 | preemption, it _can be_ possible to prevent general races - this technique is
20 | used by frameworks such as [RTIC](https://github.com/rtic-rs/rtic). However,
21 | actually having control over scheduling is a very uncommon case.
22 |
23 | For this reason, it is considered "safe" for Rust to get deadlocked or do
24 | something nonsensical with incorrect synchronization: this is known as a general
25 | race condition or resource race. Obviously such a program isn't very good, but
26 | Rust of course cannot prevent all logic errors.
27 |
28 | In any case, a race condition cannot violate memory safety in a Rust program on
29 | its own. Only in conjunction with some other unsafe code can a race condition
30 | actually violate memory safety. For instance, a correct program looks like this:
31 |
32 | ```rust,no_run
33 | use std::thread;
34 | use std::sync::atomic::{AtomicUsize, Ordering};
35 | use std::sync::Arc;
36 |
37 | let data = vec![1, 2, 3, 4];
38 | // Arc so that the memory the AtomicUsize is stored in still exists for
39 | // the other thread to increment, even if we completely finish executing
40 | // before it. Rust won't compile the program without it, because of the
41 | // lifetime requirements of thread::spawn!
42 | let idx = Arc::new(AtomicUsize::new(0));
43 | let other_idx = idx.clone();
44 |
45 | // `move` captures other_idx by-value, moving it into this thread
46 | thread::spawn(move || {
47 |     // It's ok to mutate idx because this value
48 |     // is an atomic, so it can't cause a Data Race.
49 |     other_idx.fetch_add(10, Ordering::SeqCst);
50 | });
51 |
52 | // Index with the value loaded from the atomic. This is safe because we
53 | // read the atomic memory only once, and then pass a copy of that value
54 | // to the Vec's indexing implementation. This indexing will be correctly
55 | // bounds checked, and there's no chance of the value getting changed
56 | // in the middle. However our program may panic if the thread we spawned
57 | // managed to increment before this ran. This is a race condition because
58 | // correct program execution (panicking is rarely correct) depends on the
59 | // order of thread execution.
60 | println!("{}", data[idx.load(Ordering::SeqCst)]);
61 | ```
62 |
63 | We can cause a race condition to violate memory safety if we instead do the bounds
64 | check in advance, and then unsafely access the data with an unchecked value:
65 |
66 | ```rust,no_run
67 | use std::thread;
68 | use std::sync::atomic::{AtomicUsize, Ordering};
69 | use std::sync::Arc;
70 |
71 | let data = vec![1, 2, 3, 4];
72 |
73 | let idx = Arc::new(AtomicUsize::new(0));
74 | let other_idx = idx.clone();
75 |
76 | // `move` captures other_idx by-value, moving it into this thread
77 | thread::spawn(move || {
78 |     // It's ok to mutate idx because this value
79 |     // is an atomic, so it can't cause a Data Race.
80 |     other_idx.fetch_add(10, Ordering::SeqCst);
81 | });
82 |
83 | if idx.load(Ordering::SeqCst) < data.len() {
84 |     unsafe {
85 |         // Incorrectly loading the idx after we did the bounds check.
86 |         // It could have changed. This is a race condition, *and dangerous*
87 |         // because we decided to do `get_unchecked`, which is `unsafe`.
88 |         println!("{}", data.get_unchecked(idx.load(Ordering::SeqCst)));
89 |     }
90 | }
91 | ```
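
For contrast, here is a minimal sketch of how the unchecked access could be made sound again: load the atomic exactly once, then do both the bounds check and the access with that one local copy. The function name `read_at` is hypothetical and only exists for this illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical helper: reads `data[idx]` with a single atomic load.
pub fn read_at(data: &[i32], idx: &AtomicUsize) -> Option<i32> {
    // Load exactly once. `i` is a plain local copy, so no other thread
    // can change it between the check and the access.
    let i = idx.load(Ordering::SeqCst);
    if i < data.len() {
        // SAFETY: `i < data.len()` was checked on this same local value.
        Some(unsafe { *data.get_unchecked(i) })
    } else {
        None
    }
}
```

There is still a race condition here (which element we get, if any, depends on thread scheduling), but it can no longer turn into an out-of-bounds read.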
92 |
--------------------------------------------------------------------------------
/src/arc-mutex/arc-drop.md:
--------------------------------------------------------------------------------
1 | # Dropping
2 |
3 | We now need a way to decrease the reference count and drop the data once it is
4 | low enough, otherwise the data will live forever on the heap.
5 |
6 | To do this, we can implement `Drop`.
7 |
8 | Basically, we need to:
9 |
10 | 1. Decrement the reference count
11 | 2. If we were holding the last remaining reference to the data, then:
12 | 3. Atomically fence the data to prevent reordering of the use and deletion of
13 | the data
14 | 4. Drop the inner data
15 |
16 | First, we'll need to get access to the `ArcInner`:
17 |
18 |
19 | ```rust,ignore
20 | let inner = unsafe { self.ptr.as_ref() };
21 | ```
22 |
23 | Now, we need to decrement the reference count. To streamline our code, we can
24 | also return if the returned value from `fetch_sub` (the value of the reference
25 | count before decrementing it) is not equal to `1` (which happens when we are not
26 | the last reference to the data).
27 |
28 |
29 | ```rust,ignore
30 | if inner.rc.fetch_sub(1, Ordering::Release) != 1 {
31 | return;
32 | }
33 | ```
34 |
35 | We then need to create an atomic fence to prevent reordering of the use of the
36 | data and deletion of the data. As described in [the standard library's
37 | implementation of `Arc`][3]:
38 | > This fence is needed to prevent reordering of use of the data and deletion of
39 | > the data. Because it is marked `Release`, the decreasing of the reference
40 | > count synchronizes with this `Acquire` fence. This means that use of the data
41 | > happens before decreasing the reference count, which happens before this
42 | > fence, which happens before the deletion of the data.
43 | >
44 | > As explained in the [Boost documentation][1],
45 | >
46 | > > It is important to enforce any possible access to the object in one
47 | > > thread (through an existing reference) to *happen before* deleting
48 | > > the object in a different thread. This is achieved by a "release"
49 | > > operation after dropping a reference (any access to the object
50 | > > through this reference must obviously happened before), and an
51 | > > "acquire" operation before deleting the object.
52 | >
53 | > In particular, while the contents of an Arc are usually immutable, it's
54 | > possible to have interior writes to something like a `Mutex`. Since a Mutex
55 | > is not acquired when it is deleted, we can't rely on its synchronization logic
56 | > to make writes in thread A visible to a destructor running in thread B.
57 | >
58 | > Also note that the Acquire fence here could probably be replaced with an
59 | > Acquire load, which could improve performance in highly-contended situations.
60 | > See [2].
61 | >
62 | > [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html
63 | > [2]: https://github.com/rust-lang/rust/pull/41714
64 | [3]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1440-L1467
65 |
66 | To do this, we do the following:
67 |
68 | ```rust
69 | # use std::sync::atomic::Ordering;
70 | use std::sync::atomic;
71 | atomic::fence(Ordering::Acquire);
72 | ```
73 |
74 | Finally, we can drop the data itself. We use `Box::from_raw` to drop the boxed
75 | `ArcInner<T>` and its data. This takes a `*mut T` and not a `NonNull<T>`, so we
76 | must convert using `NonNull::as_ptr`.
77 |
78 |
79 | ```rust,ignore
80 | unsafe { Box::from_raw(self.ptr.as_ptr()); }
81 | ```
82 |
83 | This is safe as we know we have the last pointer to the `ArcInner` and that its
84 | pointer is valid.
85 |
86 | Now, let's wrap this all up inside the `Drop` implementation:
87 |
88 |
89 | ```rust,ignore
90 | impl<T> Drop for Arc<T> {
91 |     fn drop(&mut self) {
92 |         let inner = unsafe { self.ptr.as_ref() };
93 |         if inner.rc.fetch_sub(1, Ordering::Release) != 1 {
94 |             return;
95 |         }
96 |         // This fence is needed to prevent reordering of the use and deletion
97 |         // of the data.
98 |         atomic::fence(Ordering::Acquire);
99 |         // This is safe as we know we have the last pointer to the `ArcInner`
100 |         // and that its pointer is valid.
101 |         unsafe { Box::from_raw(self.ptr.as_ptr()); }
102 |     }
103 | }
104 | ```
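
To see the `Drop` logic in action, here is a self-contained sketch that stitches it together with a stripped-down constructor and `Clone`. The `Clone` here is deliberately simplified (it assumes the count never overflows), and `Noisy` and `drops_after_two_handles` are hypothetical helpers that only exist to count how often the inner value is destroyed.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;
use std::sync::atomic::{self, AtomicUsize, Ordering};

pub struct Arc<T> {
    ptr: NonNull<ArcInner<T>>,
    phantom: PhantomData<ArcInner<T>>,
}

struct ArcInner<T> {
    rc: AtomicUsize,
    // Only read through `Deref`, which this sketch leaves out.
    #[allow(dead_code)]
    data: T,
}

impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        let boxed = Box::new(ArcInner { rc: AtomicUsize::new(1), data });
        Arc {
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            phantom: PhantomData,
        }
    }
}

impl<T> Clone for Arc<T> {
    fn clone(&self) -> Arc<T> {
        let inner = unsafe { self.ptr.as_ref() };
        // Simplification: no overflow guard on the reference count.
        inner.rc.fetch_add(1, Ordering::Relaxed);
        Arc { ptr: self.ptr, phantom: PhantomData }
    }
}

impl<T> Drop for Arc<T> {
    fn drop(&mut self) {
        let inner = unsafe { self.ptr.as_ref() };
        if inner.rc.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        atomic::fence(Ordering::Acquire);
        // SAFETY: we held the last reference, and the pointer came from `Box`.
        unsafe { drop(Box::from_raw(self.ptr.as_ptr())); }
    }
}

/// Hypothetical payload that bumps a counter when it is destroyed.
pub struct Noisy<'a>(pub &'a AtomicUsize);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Returns how many payload drops were observed after dropping the first
/// handle and after dropping the second one.
pub fn drops_after_two_handles() -> (usize, usize) {
    let drops = AtomicUsize::new(0);
    let a = Arc::new(Noisy(&drops));
    let b = a.clone();
    drop(a);
    let after_first = drops.load(Ordering::SeqCst);
    drop(b);
    let after_second = drops.load(Ordering::SeqCst);
    (after_first, after_second)
}
```

Dropping the first handle only decrements the count; the payload is destroyed exactly once, when the second (last) handle goes away.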
105 |
--------------------------------------------------------------------------------
/src/working-with-unsafe.md:
--------------------------------------------------------------------------------
1 | # Working with Unsafe
2 |
3 | Rust generally only gives us the tools to talk about Unsafe Rust in a scoped and
4 | binary manner. Unfortunately, reality is significantly more complicated than
5 | that. For instance, consider the following toy function:
6 |
7 | ```rust
8 | fn index(idx: usize, arr: &[u8]) -> Option<u8> {
9 |     if idx < arr.len() {
10 |         unsafe {
11 |             Some(*arr.get_unchecked(idx))
12 |         }
13 |     } else {
14 |         None
15 |     }
16 | }
17 | ```
18 |
19 | This function is safe and correct. We check that the index is in bounds, and if
20 | it is, index into the array in an unchecked manner. We say that such a correct
21 | unsafely implemented function is *sound*, meaning that safe code cannot cause
22 | Undefined Behavior through it (which, remember, is the single fundamental
23 | property of Safe Rust).
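
As a quick sanity check, the interesting inputs for a function like this are the ones right at the end of the slice. The following sketch repeats `index` and adds a hypothetical helper, `probe_boundary`, that tries the last valid index and the first invalid one.

```rust
pub fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    if idx < arr.len() {
        // SAFETY: `idx` was just checked to be in bounds.
        unsafe { Some(*arr.get_unchecked(idx)) }
    } else {
        None
    }
}

/// Hypothetical helper: probes `len - 1` and `len`.
/// For an empty slice, `len - 1` wraps to `usize::MAX`, which is also rejected.
pub fn probe_boundary(arr: &[u8]) -> (Option<u8>, Option<u8>) {
    let len = arr.len();
    (index(len.wrapping_sub(1), arr), index(len, arr))
}
```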
24 |
25 | But even in such a trivial function, the scope of the unsafe block is
26 | questionable. Consider changing the `<` to a `<=`:
27 |
28 | ```rust
29 | fn index(idx: usize, arr: &[u8]) -> Option<u8> {
30 |     if idx <= arr.len() {
31 |         unsafe {
32 |             Some(*arr.get_unchecked(idx))
33 |         }
34 |     } else {
35 |         None
36 |     }
37 | }
38 | ```
39 |
40 | This program is now *unsound*: Safe Rust can cause Undefined Behavior, and yet
41 | *we only modified safe code*. This is the fundamental problem of safety: it's
42 | non-local. The soundness of our unsafe operations necessarily depends on the
43 | state established by otherwise "safe" operations.
44 |
45 | Safety is modular in the sense that opting into unsafety doesn't require you
46 | to consider arbitrary other kinds of badness. For instance, doing an unchecked
47 | index into a slice doesn't mean you suddenly need to worry about the slice being
48 | null or containing uninitialized memory. Nothing fundamentally changes. However
49 | safety *isn't* modular in the sense that programs are inherently stateful and
50 | your unsafe operations may depend on arbitrary other state.
51 |
52 | This non-locality gets much worse when we incorporate actual persistent state.
53 | Consider a simple implementation of `Vec`:
54 |
55 | ```rust
56 | use std::ptr;
57 |
58 | // Note: This definition is naive. See the chapter on implementing Vec.
59 | pub struct Vec<T> {
60 |     ptr: *mut T,
61 |     len: usize,
62 |     cap: usize,
63 | }
64 |
65 | // Note this implementation does not correctly handle zero-sized types.
66 | // See the chapter on implementing Vec.
67 | impl<T> Vec<T> {
68 |     pub fn push(&mut self, elem: T) {
69 |         if self.len == self.cap {
70 |             // not important for this example
71 |             self.reallocate();
72 |         }
73 |         unsafe {
74 |             ptr::write(self.ptr.add(self.len), elem);
75 |             self.len += 1;
76 |         }
77 |     }
78 |     # fn reallocate(&mut self) { }
79 | }
80 |
81 | # fn main() {}
82 | ```
83 |
84 | This code is simple enough to reasonably audit and informally verify. Now consider
85 | adding the following method:
86 |
87 |
88 | ```rust,ignore
89 | fn make_room(&mut self) {
90 |     // grow the capacity
91 |     self.cap += 1;
92 | }
93 | ```
94 |
95 | This code is 100% Safe Rust but it is also completely unsound. Changing the
96 | capacity violates the invariants of Vec (that `cap` reflects the allocated space
97 | in the Vec). This is not something the rest of Vec can guard against. It *has*
98 | to trust the capacity field because there's no way to verify it.
99 |
100 | Because it relies on invariants of a struct field, this `unsafe` code
101 | does more than pollute a whole function: it pollutes a whole *module*.
102 | Generally, the only bullet-proof way to limit the scope of unsafe code is at the
103 | module boundary with privacy.
104 |
105 | However this works *perfectly*. The existence of `make_room` is *not* a
106 | problem for the soundness of Vec because we didn't mark it as public. Only the
107 | module that defines this function can call it. Also, `make_room` directly
108 | accesses the private fields of Vec, so it can only be written in the same module
109 | as Vec.
110 |
111 | It is therefore possible for us to write a completely safe abstraction that
112 | relies on complex invariants. This is *critical* to the relationship between
113 | Safe Rust and Unsafe Rust.
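
Here is a small sketch of that idea, using a made-up type rather than `Vec`. The module name `checked` and the type `InBounds` are hypothetical. The `unsafe` block relies on the invariant `idx < len`, and because both fields are private, only code inside the module can break it.

```rust
mod checked {
    /// An index known to be in bounds for slices of length `len`.
    /// Invariant relied on by the unsafe code below: `idx < len`.
    pub struct InBounds {
        idx: usize,
        len: usize,
    }

    impl InBounds {
        /// The only way to build an `InBounds` from outside this module.
        pub fn new(idx: usize, len: usize) -> Option<InBounds> {
            if idx < len {
                Some(InBounds { idx, len })
            } else {
                None
            }
        }

        pub fn get(&self, arr: &[u8]) -> Option<u8> {
            if arr.len() != self.len {
                return None;
            }
            // SAFETY: `idx < len` by the struct invariant, and we just
            // checked that `len == arr.len()`.
            Some(unsafe { *arr.get_unchecked(self.idx) })
        }
    }
}

/// Code out here cannot write `InBounds { idx: 99, len: 3 }` or touch the
/// fields, so it cannot invalidate the invariant.
pub fn second_of_three() -> Option<u8> {
    checked::InBounds::new(1, 3)?.get(&[10, 20, 30])
}
```

The audit surface for the `unsafe` block is the `checked` module and nothing else.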
114 |
115 | We have already seen that Unsafe code must trust *some* Safe code, but shouldn't
116 | trust *generic* Safe code. Privacy is important to unsafe code for similar reasons:
117 | it spares us from having to trust all the safe code in the universe not to mess
118 | with our trusted state.
119 |
120 | Safety lives!
121 |
--------------------------------------------------------------------------------
/src/arc-mutex/arc-base.md:
--------------------------------------------------------------------------------
1 | # Base Code
2 |
3 | Now that we've decided the layout for our implementation of `Arc`, let's create
4 | some basic code.
5 |
6 | ## Constructing the Arc
7 |
8 | We'll first need a way to construct an `Arc`.
9 |
10 | This is pretty simple, as we just need to box the `ArcInner` and get a
11 | `NonNull<T>` pointer to it.
12 |
13 |
14 | ```rust,ignore
15 | impl<T> Arc<T> {
16 |     pub fn new(data: T) -> Arc<T> {
17 |         // We start the reference count at 1, as that first reference is the
18 |         // current pointer.
19 |         let boxed = Box::new(ArcInner {
20 |             rc: AtomicUsize::new(1),
21 |             data,
22 |         });
23 |         Arc {
24 |             // It is okay to call `.unwrap()` here as we get a pointer from
25 |             // `Box::into_raw` which is guaranteed to not be null.
26 |             ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
27 |             phantom: PhantomData,
28 |         }
29 |     }
30 | }
31 | ```
32 |
33 | ## Send and Sync
34 |
35 | Since we're building a concurrency primitive, we'll need to be able to send it
36 | across threads. Thus, we can implement the `Send` and `Sync` marker traits. For
37 | more information on these, see [the section on `Send` and
38 | `Sync`](../send-and-sync.md).
39 |
40 | This is okay because:
41 | * You can get a mutable reference to the value inside an `Arc` if and only
42 |   if it is the only `Arc` referencing that data (which only happens in `Drop`)
43 | * We use atomics for the shared mutable reference counting
44 |
45 |
46 | ```rust,ignore
47 | unsafe impl<T: Sync + Send> Send for Arc<T> {}
48 | unsafe impl<T: Sync + Send> Sync for Arc<T> {}
49 | ```
50 |
51 | We need to have the bound `T: Sync + Send` because if we did not provide those
52 | bounds, it would be possible to share values that are thread-unsafe across a
53 | thread boundary via an `Arc`, which could possibly cause data races or
54 | unsoundness.
55 |
56 | For example, if those bounds were not present, `Arc<Rc<u8>>` would be `Sync` or
57 | `Send`, meaning that you could clone the `Rc` out of the `Arc` to send it across
58 | a thread (without creating an entirely new `Rc`), which would create data races
59 | as `Rc` is not thread-safe.
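
One way to convince yourself of these bounds is a compile-time check. The sketch below uses the standard library's `Arc` (which carries the same `T: Sync + Send` bounds) rather than ours, and `assert_send_sync` is a hypothetical helper that only exists to make the compiler verify the bounds.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical helper: compiles only if `T` is both `Send` and `Sync`.
fn assert_send_sync<T: Send + Sync>() {}

pub fn bounds_hold() -> bool {
    assert_send_sync::<Arc<u32>>();
    assert_send_sync::<Arc<Mutex<Vec<u8>>>>();
    // Uncommenting the next line is rejected by the compiler, because
    // `Rc<u8>` is neither `Send` nor `Sync`:
    // assert_send_sync::<Arc<std::rc::Rc<u8>>>();
    true
}
```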
60 |
61 | ## Getting the `ArcInner`
62 |
63 | To dereference the `NonNull<T>` pointer into a `&T`, we can call
64 | `NonNull::as_ref`. This is unsafe, unlike the typical `as_ref` function, so we
65 | must call it like this:
66 |
67 |
68 | ```rust,ignore
69 | unsafe { self.ptr.as_ref() }
70 | ```
71 |
72 | We'll be using this snippet a few times in this code (usually with an associated
73 | `let` binding).
74 |
75 | This unsafety is okay because while this `Arc` is alive, we're guaranteed that
76 | the inner pointer is valid.
77 |
78 | ## Deref
79 |
80 | Alright. Now we can make `Arc`s (and soon will be able to clone and destroy them correctly), but how do we get
81 | to the data inside?
82 |
83 | What we need now is an implementation of `Deref`.
84 |
85 | We'll need to import the trait:
86 |
87 |
88 | ```rust,ignore
89 | use std::ops::Deref;
90 | ```
91 |
92 | And here's the implementation:
93 |
94 |
95 | ```rust,ignore
96 | impl<T> Deref for Arc<T> {
97 |     type Target = T;
98 |
99 |     fn deref(&self) -> &T {
100 |         let inner = unsafe { self.ptr.as_ref() };
101 |         &inner.data
102 |     }
103 | }
104 | ```
105 |
106 | Pretty simple, eh? This simply dereferences the `NonNull` pointer to the
107 | `ArcInner`, then gets a reference to the data inside.
108 |
109 | ## Code
110 |
111 | Here's all the code from this section:
112 |
113 |
114 | ```rust,ignore
115 | use std::ops::Deref;
116 |
117 | impl<T> Arc<T> {
118 |     pub fn new(data: T) -> Arc<T> {
119 |         // We start the reference count at 1, as that first reference is the
120 |         // current pointer.
121 |         let boxed = Box::new(ArcInner {
122 |             rc: AtomicUsize::new(1),
123 |             data,
124 |         });
125 |         Arc {
126 |             // It is okay to call `.unwrap()` here as we get a pointer from
127 |             // `Box::into_raw` which is guaranteed to not be null.
128 |             ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
129 |             phantom: PhantomData,
130 |         }
131 |     }
132 | }
133 |
134 | unsafe impl<T: Sync + Send> Send for Arc<T> {}
135 | unsafe impl<T: Sync + Send> Sync for Arc<T> {}
136 |
137 |
138 | impl<T> Deref for Arc<T> {
139 |     type Target = T;
140 |
141 |     fn deref(&self) -> &T {
142 |         let inner = unsafe { self.ptr.as_ref() };
143 |         &inner.data
144 |     }
145 | }
146 | ```
147 |
--------------------------------------------------------------------------------
/src/vec/vec-raw.md:
--------------------------------------------------------------------------------
1 | # RawVec
2 |
3 | We've actually reached an interesting situation here: we've duplicated the logic
4 | for specifying a buffer and freeing its memory in Vec and IntoIter. Now that
5 | we've implemented it and identified *actual* logic duplication, this is a good
6 | time to perform some logic compression.
7 |
8 | We're going to abstract out the `(ptr, cap)` pair and give them the logic for
9 | allocating, growing, and freeing:
10 |
11 |
12 | ```rust,ignore
13 | struct RawVec<T> {
14 |     ptr: NonNull<T>,
15 |     cap: usize,
16 | }
17 |
18 | unsafe impl<T: Send> Send for RawVec<T> {}
19 | unsafe impl<T: Sync> Sync for RawVec<T> {}
20 |
21 | impl<T> RawVec<T> {
22 |     fn new() -> Self {
23 |         assert!(mem::size_of::<T>() != 0, "TODO: implement ZST support");
24 |         RawVec {
25 |             ptr: NonNull::dangling(),
26 |             cap: 0,
27 |         }
28 |     }
29 |
30 |     fn grow(&mut self) {
31 |         // This can't overflow because we ensure self.cap <= isize::MAX.
32 |         let new_cap = if self.cap == 0 { 1 } else { 2 * self.cap };
33 |
34 |         // Layout::array checks that the number of bytes is <= usize::MAX,
35 |         // but this is redundant since old_layout.size() <= isize::MAX,
36 |         // so the `unwrap` should never fail.
37 |         let new_layout = Layout::array::<T>(new_cap).unwrap();
38 |
39 |         // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
40 |         assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large");
41 |
42 |         let new_ptr = if self.cap == 0 {
43 |             unsafe { alloc::alloc(new_layout) }
44 |         } else {
45 |             let old_layout = Layout::array::<T>(self.cap).unwrap();
46 |             let old_ptr = self.ptr.as_ptr() as *mut u8;
47 |             unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
48 |         };
49 |
50 |         // If allocation fails, `new_ptr` will be null, in which case we abort.
51 |         self.ptr = match NonNull::new(new_ptr as *mut T) {
52 |             Some(p) => p,
53 |             None => alloc::handle_alloc_error(new_layout),
54 |         };
55 |         self.cap = new_cap;
56 |     }
57 | }
58 |
59 | impl<T> Drop for RawVec<T> {
60 |     fn drop(&mut self) {
61 |         if self.cap != 0 {
62 |             let layout = Layout::array::<T>(self.cap).unwrap();
63 |             unsafe {
64 |                 alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout);
65 |             }
66 |         }
67 |     }
68 | }
69 | ```
70 |
71 | And change Vec as follows:
72 |
73 |
74 | ```rust,ignore
75 | pub struct Vec<T> {
76 |     buf: RawVec<T>,
77 |     len: usize,
78 | }
79 |
80 | impl<T> Vec<T> {
81 |     fn ptr(&self) -> *mut T {
82 |         self.buf.ptr.as_ptr()
83 |     }
84 |
85 |     fn cap(&self) -> usize {
86 |         self.buf.cap
87 |     }
88 |
89 |     pub fn new() -> Self {
90 |         Vec {
91 |             buf: RawVec::new(),
92 |             len: 0,
93 |         }
94 |     }
95 |
96 |     // push/pop/insert/remove largely unchanged:
97 |     // * `self.ptr.as_ptr() -> self.ptr()`
98 |     // * `self.cap -> self.cap()`
99 |     // * `self.grow() -> self.buf.grow()`
100 | }
101 |
102 | impl<T> Drop for Vec<T> {
103 |     fn drop(&mut self) {
104 |         while let Some(_) = self.pop() {}
105 |         // deallocation is handled by RawVec
106 |     }
107 | }
108 | ```
109 |
110 | And finally we can really simplify IntoIter:
111 |
112 |
113 | ```rust,ignore
114 | pub struct IntoIter<T> {
115 |     _buf: RawVec<T>, // we don't actually care about this. Just need it to live.
116 |     start: *const T,
117 |     end: *const T,
118 | }
119 |
120 | // next and next_back literally unchanged since they never referred to the buf
121 |
122 | impl<T> Drop for IntoIter<T> {
123 |     fn drop(&mut self) {
124 |         // only need to ensure all our elements are read;
125 |         // buffer will clean itself up afterwards.
126 |         for _ in &mut *self {}
127 |     }
128 | }
129 |
130 | impl<T> IntoIterator for Vec<T> {
131 |     type Item = T;
132 |     type IntoIter = IntoIter<T>;
133 |     fn into_iter(self) -> IntoIter<T> {
134 |         // need to use ptr::read to unsafely move the buf out since it's
135 |         // not Copy, and Vec implements Drop (so we can't destructure it).
136 |         let buf = unsafe { ptr::read(&self.buf) };
137 |         let len = self.len;
138 |         mem::forget(self);
139 |
140 |         IntoIter {
141 |             start: buf.ptr.as_ptr(),
142 |             end: if buf.cap == 0 {
143 |                 // can't offset off of a pointer unless it's part of an allocation
144 |                 buf.ptr.as_ptr()
145 |             } else {
146 |                 unsafe { buf.ptr.as_ptr().add(len) }
147 |             },
148 |             _buf: buf,
149 |         }
150 |     }
151 | }
152 | ```
153 |
154 | Much better.
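
If you want to poke at `RawVec` in isolation, the following sketch makes it compile on its own by adding the imports and two small accessors. The accessors on `RawVec` itself and the function `grow_three_times` are hypothetical additions for this demonstration, and zero-sized types are still rejected, exactly as above.

```rust
use std::alloc::{self, Layout};
use std::mem;
use std::ptr::NonNull;

pub struct RawVec<T> {
    ptr: NonNull<T>,
    cap: usize,
}

impl<T> RawVec<T> {
    pub fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "ZSTs are not handled in this sketch");
        RawVec { ptr: NonNull::dangling(), cap: 0 }
    }

    // Hypothetical accessors, added only so the demo can observe the buffer.
    pub fn cap(&self) -> usize { self.cap }
    pub fn ptr(&self) -> *mut T { self.ptr.as_ptr() }

    pub fn grow(&mut self) {
        let new_cap = if self.cap == 0 { 1 } else { 2 * self.cap };
        let new_layout = Layout::array::<T>(new_cap).unwrap();
        assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large");

        let new_ptr = if self.cap == 0 {
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).unwrap();
            let old_ptr = self.ptr.as_ptr() as *mut u8;
            unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
        };

        self.ptr = match NonNull::new(new_ptr as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}

impl<T> Drop for RawVec<T> {
    fn drop(&mut self) {
        if self.cap != 0 {
            let layout = Layout::array::<T>(self.cap).unwrap();
            unsafe { alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout) };
        }
    }
}

/// Records the capacity before and after each of three `grow` calls.
pub fn grow_three_times() -> Vec<usize> {
    let mut buf: RawVec<u32> = RawVec::new();
    let mut caps = vec![buf.cap()];
    for _ in 0..3 {
        buf.grow();
        caps.push(buf.cap());
    }
    // The buffer is real memory now: write one element and read it back.
    unsafe {
        buf.ptr().write(7);
        assert_eq!(buf.ptr().read(), 7);
    }
    caps
}
```

The capacities follow the doubling strategy from `grow`: nothing allocated at first, then 1, 2, and 4 elements.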
155 |
--------------------------------------------------------------------------------
/src/vec/vec-drain.md:
--------------------------------------------------------------------------------
1 | # Drain
2 |
3 | Let's move on to Drain. Drain is largely the same as IntoIter, except that
4 | instead of consuming the Vec, it borrows the Vec and leaves its allocation
5 | untouched. For now we'll only implement the "basic" full-range version.
6 |
7 |
8 | ```rust,ignore
9 | use std::marker::PhantomData;
10 |
11 | struct Drain<'a, T: 'a> {
12 |     // Need to bound the lifetime here, so we do it with `&'a mut Vec<T>`
13 |     // because that's semantically what we contain. We're "just" calling
14 |     // `pop()` and `remove(0)`.
15 |     vec: PhantomData<&'a mut Vec<T>>,
16 |     start: *const T,
17 |     end: *const T,
18 | }
19 |
20 | impl<'a, T> Iterator for Drain<'a, T> {
21 |     type Item = T;
22 |     fn next(&mut self) -> Option<T> {
23 |         if self.start == self.end {
24 |             None
25 | ```
26 |
27 | -- wait, this is seeming familiar. Let's do some more compression. Both
28 | IntoIter and Drain have the exact same structure, let's just factor it out.
29 |
30 |
31 | ```rust,ignore
32 | struct RawValIter<T> {
33 |     start: *const T,
34 |     end: *const T,
35 | }
36 |
37 | impl<T> RawValIter<T> {
38 |     // unsafe to construct because it has no associated lifetimes.
39 |     // This is necessary to store a RawValIter in the same struct as
40 |     // its actual allocation. OK since it's a private implementation
41 |     // detail.
42 |     unsafe fn new(slice: &[T]) -> Self {
43 |         RawValIter {
44 |             start: slice.as_ptr(),
45 |             end: if slice.len() == 0 {
46 |                 // if `len = 0`, then this is not actually allocated memory.
47 |                 // Need to avoid offsetting because that will give wrong
48 |                 // information to LLVM via GEP.
49 |                 slice.as_ptr()
50 |             } else {
51 |                 slice.as_ptr().add(slice.len())
52 |             }
53 |         }
54 |     }
55 | }
56 |
57 | // Iterator and DoubleEndedIterator impls identical to IntoIter.
58 | ```
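
For reference, here is a self-contained sketch of `RawValIter` with the two iterator impls spelled out, so it can be run without the rest of Vec. It is exercised on an array of `u32`, which is `Copy`, so reading values out with `ptr::read` while the array still owns them is harmless here. The function `ends_then_middle` is a hypothetical demo, and zero-sized types are not handled.

```rust
use std::mem;
use std::ptr;

struct RawValIter<T> {
    start: *const T,
    end: *const T,
}

impl<T> RawValIter<T> {
    /// # Safety
    /// The slice must outlive the iterator, and the caller must not use the
    /// elements again unless `T: Copy`.
    unsafe fn new(slice: &[T]) -> Self {
        RawValIter {
            start: slice.as_ptr(),
            end: if slice.len() == 0 {
                slice.as_ptr()
            } else {
                unsafe { slice.as_ptr().add(slice.len()) }
            },
        }
    }
}

impl<T> Iterator for RawValIter<T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                // Read first, then step forward.
                let result = ptr::read(self.start);
                self.start = self.start.add(1);
                Some(result)
            }
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let len = (self.end as usize - self.start as usize) / mem::size_of::<T>();
        (len, Some(len))
    }
}

impl<T> DoubleEndedIterator for RawValIter<T> {
    fn next_back(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                // Step backward first, then read.
                self.end = self.end.sub(1);
                Some(ptr::read(self.end))
            }
        }
    }
}

/// Takes one element from each end, then collects whatever is left.
pub fn ends_then_middle() -> (Option<u32>, Option<u32>, usize, Vec<u32>) {
    let data = [1u32, 2, 3, 4, 5];
    // SAFETY: `data` outlives `iter`, and `u32` is `Copy`.
    let mut iter = unsafe { RawValIter::new(&data[..]) };
    let first = iter.next();
    let last = iter.next_back();
    let remaining = iter.size_hint().0;
    let middle: Vec<u32> = iter.collect();
    (first, last, remaining, middle)
}
```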
59 |
60 | And IntoIter becomes the following:
61 |
62 |
63 | ```rust,ignore
64 | pub struct IntoIter<T> {
65 |     _buf: RawVec<T>, // we don't actually care about this. Just need it to live.
66 |     iter: RawValIter<T>,
67 | }
68 |
69 | impl<T> Iterator for IntoIter<T> {
70 |     type Item = T;
71 |     fn next(&mut self) -> Option<T> { self.iter.next() }
72 |     fn size_hint(&self) -> (usize, Option<usize>) { self.iter.size_hint() }
73 | }
74 |
75 | impl<T> DoubleEndedIterator for IntoIter<T> {
76 |     fn next_back(&mut self) -> Option<T> { self.iter.next_back() }
77 | }
78 |
79 | impl<T> Drop for IntoIter<T> {
80 |     fn drop(&mut self) {
81 |         for _ in &mut *self {}
82 |     }
83 | }
84 |
85 | impl<T> IntoIterator for Vec<T> {
86 |     type Item = T;
87 |     type IntoIter = IntoIter<T>;
88 |     fn into_iter(self) -> IntoIter<T> {
89 |         unsafe {
90 |             let iter = RawValIter::new(&self);
91 |
92 |             let buf = ptr::read(&self.buf);
93 |             mem::forget(self);
94 |
95 |             IntoIter {
96 |                 iter,
97 |                 _buf: buf,
98 |             }
99 |         }
100 |     }
101 | }
102 | ```
103 |
104 | Note that I've left a few quirks in this design to make upgrading Drain to work
105 | with arbitrary subranges a bit easier. In particular we *could* have RawValIter
106 | drain itself on drop, but that won't work right for a more complex Drain.
107 | We also take a slice to simplify Drain initialization.
108 |
109 | Alright, now Drain is really easy:
110 |
111 |
112 | ```rust,ignore
113 | use std::marker::PhantomData;
114 |
115 | pub struct Drain<'a, T: 'a> {
116 |     vec: PhantomData<&'a mut Vec<T>>,
117 |     iter: RawValIter<T>,
118 | }
119 |
120 | impl<'a, T> Iterator for Drain<'a, T> {
121 |     type Item = T;
122 |     fn next(&mut self) -> Option<T> { self.iter.next() }
123 |     fn size_hint(&self) -> (usize, Option<usize>) { self.iter.size_hint() }
124 | }
125 |
126 | impl<'a, T> DoubleEndedIterator for Drain<'a, T> {
127 |     fn next_back(&mut self) -> Option<T> { self.iter.next_back() }
128 | }
129 |
130 | impl<'a, T> Drop for Drain<'a, T> {
131 |     fn drop(&mut self) {
132 |         for _ in &mut *self {}
133 |     }
134 | }
135 |
136 | impl<T> Vec<T> {
137 |     pub fn drain(&mut self) -> Drain<T> {
138 |         let iter = unsafe { RawValIter::new(&self) };
139 |
140 |         // this is a mem::forget safety thing. If Drain is forgotten, we just
141 |         // leak the whole Vec's contents. Also we need to do this *eventually*
142 |         // anyway, so why not do it now?
143 |         self.len = 0;
144 |
145 |         Drain {
146 |             iter,
147 |             vec: PhantomData,
148 |         }
149 |     }
150 | }
151 | ```
152 |
153 | For more details on the `mem::forget` problem, see the
154 | [section on leaks][leaks].
155 |
156 | [leaks]: ../leaking.html
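
To make the intended behavior concrete, here is what a full-range drain looks like from the caller's side. This sketch uses the standard library's `Vec::drain(..)` as a stand-in for our version (an assumption made for illustration; the two are not the same code), and `drain_all` is a hypothetical helper.

```rust
/// Moves every element out of `v` and returns them in order.
/// `v` is only borrowed, so it can be reused afterwards.
pub fn drain_all(v: &mut Vec<String>) -> Vec<String> {
    v.drain(..).collect()
}
```

After the call the source vector is empty but still usable, which is the property that distinguishes Drain from IntoIter.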
157 |
--------------------------------------------------------------------------------
/src/vec/vec-into-iter.md:
--------------------------------------------------------------------------------
1 | # IntoIter
2 |
3 | Let's move on to writing iterators. `iter` and `iter_mut` have already been
4 | written for us thanks to The Magic of Deref. However there are two interesting
5 | iterators that Vec provides that slices can't: `into_iter` and `drain`.
6 |
7 | IntoIter consumes the Vec by-value, and can consequently yield its elements
8 | by-value. In order to enable this, IntoIter needs to take control of Vec's
9 | allocation.
10 |
11 | IntoIter needs to be DoubleEnded as well, to enable reading from both ends.
12 | Reading from the back could just be implemented as calling `pop`, but reading
13 | from the front is harder. We could call `remove(0)` but that would be insanely
14 | expensive. Instead we're going to just use ptr::read to copy values out of
15 | either end of the Vec without mutating the buffer at all.
16 |
17 | To do this we're going to use a very common C idiom for array iteration. We'll
18 | make two pointers; one that points to the start of the array, and one that
19 | points to one-element past the end. When we want an element from one end, we'll
20 | read out the value pointed to at that end and move the pointer over by one. When
21 | the two pointers are equal, we know we're done.
22 |
23 | Note that the order of read and offset are reversed for `next` and `next_back`.
24 | For `next_back` the pointer is always after the element it wants to read next,
25 | while for `next` the pointer is always at the element it wants to read next.
26 | To see why this is, consider the case where every element but one has been
27 | yielded.
28 |
29 | The array looks like this:
30 |
31 | ```text
32 |           S  E
33 | [X, X, X, O, X, X, X]
34 | ```
35 |
36 | If E pointed directly at the element it wanted to yield next, it would be
37 | indistinguishable from the case where there are no more elements to yield.
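
The same half-open convention can be sketched in entirely safe code with two indices instead of two pointers. The function below is a hypothetical illustration, not part of the Vec implementation: `start` is the next element to yield from the front, and `end` is one past the next element to yield from the back.

```rust
/// Yields elements alternately from the front and the back of `items`.
pub fn alternate_ends(items: &[char]) -> Vec<char> {
    let (mut start, mut end) = (0, items.len());
    let mut out = Vec::new();
    let mut take_front = true;
    // `start == end` means "nothing left", including for an empty slice.
    while start != end {
        if take_front {
            // Front: read, then step forward.
            out.push(items[start]);
            start += 1;
        } else {
            // Back: step backward, then read.
            end -= 1;
            out.push(items[end]);
        }
        take_front = !take_front;
    }
    out
}
```

Because `end` always sits one past the element it would yield next, the single remaining element in the diagram above is still distinguishable from the empty case.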
38 |
39 | Although we don't actually care about it during iteration, we also need to hold
40 | onto the Vec's allocation information in order to free it once IntoIter is
41 | dropped.
42 |
43 | So we're going to use the following struct:
44 |
45 |
46 | ```rust,ignore
47 | pub struct IntoIter<T> {
48 |     buf: NonNull<T>,
49 |     cap: usize,
50 |     start: *const T,
51 |     end: *const T,
52 | }
53 | ```
54 |
55 | And this is what we end up with for initialization:
56 |
57 |
58 | ```rust,ignore
59 | impl<T> IntoIterator for Vec<T> {
60 |     type Item = T;
61 |     type IntoIter = IntoIter<T>;
62 |     fn into_iter(self) -> IntoIter<T> {
63 |         // Make sure not to drop Vec since that would free the buffer
64 |         let vec = ManuallyDrop::new(self);
65 |
66 |         // Can't destructure Vec since it's Drop
67 |         let ptr = vec.ptr;
68 |         let cap = vec.cap;
69 |         let len = vec.len;
70 |
71 |         IntoIter {
72 |             buf: ptr,
73 |             cap,
74 |             start: ptr.as_ptr(),
75 |             end: if cap == 0 {
76 |                 // can't offset off this pointer, it's not allocated!
77 |                 ptr.as_ptr()
78 |             } else {
79 |                 unsafe { ptr.as_ptr().add(len) }
80 |             },
81 |         }
82 |     }
83 | }
84 | ```
85 |
86 | Here's iterating forward:
87 |
88 |
89 | ```rust,ignore
90 | impl<T> Iterator for IntoIter<T> {
91 |     type Item = T;
92 |     fn next(&mut self) -> Option<T> {
93 |         if self.start == self.end {
94 |             None
95 |         } else {
96 |             unsafe {
97 |                 let result = ptr::read(self.start);
98 |                 self.start = self.start.offset(1);
99 |                 Some(result)
100 |             }
101 |         }
102 |     }
103 |
104 |     fn size_hint(&self) -> (usize, Option<usize>) {
105 |         let len = (self.end as usize - self.start as usize)
106 |                   / mem::size_of::<T>();
107 |         (len, Some(len))
108 |     }
109 | }
110 | ```
111 |
112 | And here's iterating backwards.
113 |
114 |
115 | ```rust,ignore
116 | impl<T> DoubleEndedIterator for IntoIter<T> {
117 |     fn next_back(&mut self) -> Option<T> {
118 |         if self.start == self.end {
119 |             None
120 |         } else {
121 |             unsafe {
122 |                 self.end = self.end.offset(-1);
123 |                 Some(ptr::read(self.end))
124 |             }
125 |         }
126 |     }
127 | }
128 | ```
129 |
130 | Because IntoIter takes ownership of its allocation, it needs to implement Drop
131 | to free it. However it also wants to implement Drop to drop any elements it
132 | contains that weren't yielded.
133 |
134 |
135 | ```rust,ignore
136 | impl<T> Drop for IntoIter<T> {
137 |     fn drop(&mut self) {
138 |         if self.cap != 0 {
139 |             // drop any remaining elements
140 |             for _ in &mut *self {}
141 |             let layout = Layout::array::<T>(self.cap).unwrap();
142 |             unsafe {
143 |                 alloc::dealloc(self.buf.as_ptr() as *mut u8, layout);
144 |             }
145 |         }
146 |     }
147 | }
148 | ```
149 |
--------------------------------------------------------------------------------
/src/what-unsafe-does.md:
--------------------------------------------------------------------------------
1 | # What Unsafe Rust Can Do
2 |
3 | The only things that are different in Unsafe Rust are that you can:
4 |
5 | * Dereference raw pointers
6 | * Call `unsafe` functions (including C functions, compiler intrinsics, and the raw allocator)
7 | * Implement `unsafe` traits
8 | * Access or modify mutable statics
9 | * Access fields of `union`s
10 |
11 | That's it. The reason these operations are relegated to Unsafe is that misusing
12 | any of these things will cause the ever dreaded Undefined Behavior. Invoking
13 | Undefined Behavior gives the compiler full rights to do arbitrarily bad things
14 | to your program. You definitely *should not* invoke Undefined Behavior.
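
As a rough tour, the sketch below touches four of those five abilities in a well-defined way. All names in it (`IntOrFloat`, `Zeroable`, `read_byte`, `four_of_five`) are hypothetical and exist only for this illustration. Mutable statics are left out to keep the example short.

```rust
/// A union: both fields share the same four bytes of storage.
pub union IntOrFloat {
    pub i: u32,
    pub f: f32,
}

/// Hypothetical unsafe trait: implementors promise that an all-zero bit
/// pattern is a valid value of the type.
pub unsafe trait Zeroable {}

// Implementing an `unsafe` trait requires `unsafe impl`.
unsafe impl Zeroable for u32 {}

/// # Safety
/// `p` must point to a readable, initialized `u8`.
pub unsafe fn read_byte(p: *const u8) -> u8 {
    // Dereferencing a raw pointer.
    unsafe { *p }
}

pub fn four_of_five() -> (u8, u32) {
    let bytes = [7u8, 8, 9];
    // Calling an `unsafe` function.
    let first = unsafe { read_byte(bytes.as_ptr()) };

    let value = IntOrFloat { f: 1.0 };
    // Accessing a field of a union. Every bit pattern is a valid `u32`,
    // so reinterpreting the float's bytes is fine here.
    let bits = unsafe { value.i };

    (first, bits)
}
```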
15 |
16 | Unlike C, Undefined Behavior is pretty limited in scope in Rust. All the core
17 | language cares about is preventing the following things:
18 |
19 | * Dereferencing (using the `*` operator on) dangling or unaligned pointers (see below)
20 | * Breaking the [pointer aliasing rules][]
21 | * Calling a function with the wrong call ABI or unwinding from a function with the wrong unwind ABI.
22 | * Causing a [data race][race]
23 | * Executing code compiled with [target features][] that the current thread of execution does
24 |   not support
25 | * Producing invalid values (either alone or as a field of a compound type such
26 |   as `enum`/`struct`/array/tuple):
27 |     * a `bool` that isn't 0 or 1
28 |     * an `enum` with an invalid discriminant
29 |     * a null `fn` pointer
30 |     * a `char` outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]
31 |     * a `!` (all values are invalid for this type)
32 |     * an integer (`i*`/`u*`), floating point value (`f*`), or raw pointer read from
33 |       [uninitialized memory][], or uninitialized memory in a `str`.
34 |     * a reference/`Box` that is dangling, unaligned, or points to an invalid value.
35 |     * a wide reference, `Box`, or raw pointer that has invalid metadata:
36 |         * `dyn Trait` metadata is invalid if it is not a pointer to a vtable for
37 |           `Trait` that matches the actual dynamic trait the pointer or reference points to
38 |         * slice metadata is invalid if the length is not a valid `usize`
39 |           (i.e., it must not be read from uninitialized memory)
40 |     * a type with custom invalid values that is one of those values, such as a
41 |       [`NonNull`] that is null. (Requesting custom invalid values is an unstable
42 |       feature, but some stable libstd types, like `NonNull`, make use of it.)
43 |
44 | For a more detailed explanation about "Undefined Behavior", you may refer to
45 | [the reference][behavior-considered-undefined].
46 |
47 | "Producing" a value happens any time a value is assigned, passed to a
48 | function/primitive operation or returned from a function/primitive operation.
49 |
50 | A reference/pointer is "dangling" if it is null or not all of the bytes it
51 | points to are part of the same allocation (so in particular they all have to be
52 | part of *some* allocation). The span of bytes it points to is determined by the
53 | pointer value and the size of the pointee type. As a consequence, if the span is
54 | empty, "dangling" is the same as "null". Note that slices and strings point
55 | to their entire range, so it's important that the length metadata is never too
56 | large (in particular, allocations and therefore slices and strings cannot be
57 | bigger than `isize::MAX` bytes). If for some reason this is too cumbersome,
58 | consider using raw pointers.
59 |
60 | That's it. That's all the causes of Undefined Behavior baked into Rust. Of
61 | course, unsafe functions and traits are free to declare arbitrary other
62 | constraints that a program must maintain to avoid Undefined Behavior. For
63 | instance, the allocator APIs declare that deallocating unallocated memory is
64 | Undefined Behavior.
65 |
66 | However, violations of these constraints generally will just transitively lead to one of
67 | the above problems. Some additional constraints may also derive from compiler
68 | intrinsics that make special assumptions about how code can be optimized. For instance,
69 | Vec and Box make use of intrinsics that require their pointers to be non-null at all times.
70 |
71 | Rust is otherwise quite permissive with respect to other dubious operations.
72 | Rust considers it "safe" to:
73 |
74 | * Deadlock
75 | * Have a [race condition][race]
76 | * Leak memory
77 | * Overflow integers (with the built-in operators such as `+` etc.)
78 | * Abort the program
79 | * Delete the production database
80 |
81 | For more detailed information, you may refer to [the reference][behavior-not-considered-unsafe].
82 |
83 | However, any program that actually manages to do such a thing is *probably*
84 | incorrect. Rust provides lots of tools to make these things rare, but
85 | these problems are considered impractical to categorically prevent.
86 |
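To make two of the items on that list concrete, here is a small sketch in entirely safe code. The function names `leak_a_vec` and `overflow` are hypothetical; only `mem::forget`, `wrapping_add`, and `checked_add` come from the standard library.

```rust
use std::mem;

/// Leaks a heap allocation without any `unsafe`: the destructor never runs,
/// so the buffer is never freed.
fn leak_a_vec() {
    let v = vec![1u8, 2, 3];
    mem::forget(v);
}

/// Overflows with explicit wrapping semantics. The plain `+` operator would
/// panic in a default debug build and wrap in a default release build.
fn overflow(x: u8) -> u8 {
    x.wrapping_add(1)
}

fn main() {
    leak_a_vec();
    assert_eq!(overflow(u8::MAX), 0);
    // `checked_add` reports the overflow instead of wrapping.
    assert_eq!(u8::MAX.checked_add(1), None);
}
```

Neither function is Undefined Behavior, but both are likely bugs if they happen by accident.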
87 | [pointer aliasing rules]: references.html
88 | [uninitialized memory]: uninitialized.html
89 | [race]: races.html
90 | [target features]: ../reference/attributes/codegen.html#the-target_feature-attribute
91 | [`NonNull`]: ../std/ptr/struct.NonNull.html
92 | [behavior-considered-undefined]: ../reference/behavior-considered-undefined.html
93 | [behavior-not-considered-unsafe]: ../reference/behavior-not-considered-unsafe.html
94 |
--------------------------------------------------------------------------------
/src/repr-rust.md:
--------------------------------------------------------------------------------
1 | # repr(Rust)
2 |
3 | First and foremost, all types have an alignment specified in bytes. The
4 | alignment of a type specifies what addresses are valid to store the value at. A
5 | value with alignment `n` must only be stored at an address that is a multiple of
6 | `n`. So alignment 2 means you must be stored at an even address, and 1 means
7 | that you can be stored anywhere. Alignment is at least 1, and always a power
8 | of 2.
9 |
10 | Primitives are usually aligned to their size, although this is
11 | platform-specific behavior. For example, on x86 `u64` and `f64` are often
12 | aligned to 4 bytes (32 bits).
13 |
14 | A type's size must always be a multiple of its alignment (zero being a valid size
15 | for any alignment). This ensures that an array of that type may always be indexed
16 | by offsetting by a multiple of its size. Note that the size and alignment of a
17 | type may not be known statically in the case of [dynamically sized types][dst].
18 |
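The invariants above can be checked for any sized type with `std::mem::align_of` and `std::mem::size_of`. This is a small sketch; the helper `layout_invariants_hold` is a hypothetical name, and the printed numbers depend on the target.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: checks that the alignment of `T` is a power of two
/// (and hence at least 1) and that the size of `T` is a multiple of it.
fn layout_invariants_hold<T>() -> bool {
    let align = align_of::<T>();
    let size = size_of::<T>();
    align >= 1 && align.is_power_of_two() && size % align == 0
}

fn main() {
    assert!(layout_invariants_hold::<u8>());
    assert!(layout_invariants_hold::<u64>());
    assert!(layout_invariants_hold::<[u16; 3]>());
    // Zero is a valid size for any alignment.
    assert!(layout_invariants_hold::<()>());

    // Platform-specific: the alignment of `u64` need not equal its size.
    println!("u64: size {}, align {}", size_of::<u64>(), align_of::<u64>());
}
```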
19 | Rust gives you the following ways to lay out composite data:
20 |
21 | * structs (named product types)
22 | * tuples (anonymous product types)
23 | * arrays (homogeneous product types)
24 | * enums (named sum types -- tagged unions)
25 | * unions (untagged unions)
26 |
27 | An enum is said to be *field-less* if none of its variants have associated data.
28 |
29 | By default, composite structures have an alignment equal to the maximum
30 | of their fields' alignments. Rust will consequently insert padding where
31 | necessary to ensure that all fields are properly aligned and that the overall
32 | type's size is a multiple of its alignment. For instance:
33 |
34 | ```rust
35 | struct A {
36 |     a: u8,
37 |     b: u32,
38 |     c: u16,
39 | }
40 | ```
41 |
42 | will be 32-bit aligned on a target that aligns these primitives to their
43 | respective sizes. The whole struct will therefore have a size that is a multiple
44 | of 32-bits. It may become:
45 |
46 | ```rust
47 | struct A {
48 |     a: u8,
49 |     _pad1: [u8; 3], // to align `b`
50 |     b: u32,
51 |     c: u16,
52 |     _pad2: [u8; 2], // to make overall size multiple of 4
53 | }
54 | ```
55 |
56 | or maybe:
57 |
58 | ```rust
59 | struct A {
60 |     b: u32,
61 |     c: u16,
62 |     a: u8,
63 |     _pad: u8,
64 | }
65 | ```
66 |
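A rough way to observe the two candidate layouts is to compare the default representation with a `#[repr(C)]` copy, which keeps fields in declaration order. This is only a sketch: `AInOrder` and `struct_sizes` are hypothetical names, and the size of the default-representation struct is whatever the compiler happens to choose.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
struct A {
    a: u8,
    b: u32,
    c: u16,
}

// Hypothetical twin of `A` that forces declaration order, as C would.
#[allow(dead_code)]
#[repr(C)]
struct AInOrder {
    a: u8,
    b: u32,
    c: u16,
}

/// Returns (size of `A`, size of `AInOrder`).
fn struct_sizes() -> (usize, usize) {
    (size_of::<A>(), size_of::<AInOrder>())
}

fn main() {
    let (default_size, in_order_size) = struct_sizes();
    println!("A: {default_size} bytes, align {}", align_of::<A>());
    println!("AInOrder: {in_order_size} bytes, align {}", align_of::<AInOrder>());

    // The fields alone need 1 + 4 + 2 = 7 bytes; everything beyond is padding.
    assert!(default_size >= 7);
    // With `u32` aligned to 4 and `u16` to 2, declaration order costs
    // 3 + 2 bytes of padding, for 12 bytes in total.
    if align_of::<u32>() == 4 && align_of::<u16>() == 2 {
        assert_eq!(in_order_size, 12);
    }
}
```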
67 | There is *no indirection* for these types; all data is stored within the struct,
68 | as you would expect in C. However with the exception of arrays (which are
69 | densely packed and in-order), the layout of data is not specified by default.
70 | Given the two following struct definitions:
71 |
72 | ```rust
73 | struct A {
74 |     a: i32,
75 |     b: u64,
76 | }
77 |
78 | struct B {
79 |     a: i32,
80 |     b: u64,
81 | }
82 | ```
83 |
84 | Rust *does* guarantee that two instances of A have their data laid out in
85 | exactly the same way. However Rust *does not* currently guarantee that an
86 | instance of A has the same field ordering or padding as an instance of B.
87 |
88 | With A and B as written, this point would seem to be pedantic, but several other
89 | features of Rust make it desirable for the language to play with data layout in
90 | complex ways.
91 |
92 | For instance, consider this struct:
93 |
94 | ```rust
95 | struct Foo<T, U> {
96 |     count: u16,
97 |     data1: T,
98 |     data2: U,
99 | }
100 | ```
101 |
102 | Now consider the monomorphizations of `Foo<u32, u16>` and `Foo<u16, u32>`. If
103 | Rust lays out the fields in the order specified, we expect it to pad the
104 | values in the struct to satisfy their alignment requirements. So if Rust
105 | didn't reorder fields, we would expect it to produce the following:
106 |
107 |
108 | ```rust,ignore
109 | struct Foo<u16, u32> {
110 |     count: u16,
111 |     data1: u16,
112 |     data2: u32,
113 | }
114 |
115 | struct Foo<u32, u16> {
116 |     count: u16,
117 |     _pad1: u16,
118 |     data1: u32,
119 |     data2: u16,
120 |     _pad2: u16,
121 | }
122 | ```
123 |
124 | The latter case quite simply wastes space. An optimal use of space
125 | requires different monomorphizations to have *different field orderings*.
126 |
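The cost of a fixed field order can be measured by forcing it with `#[repr(C)]`. The sketch below uses the hypothetical names `FooInOrder` and `in_order_sizes`; the sizes of the default-representation `Foo` are printed rather than asserted, since Rust does not guarantee them.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
struct Foo<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

// Hypothetical twin of `Foo` whose fields stay in declaration order.
#[allow(dead_code)]
#[repr(C)]
struct FooInOrder<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

/// Returns the sizes of `FooInOrder<u16, u32>` and `FooInOrder<u32, u16>`.
fn in_order_sizes() -> (usize, usize) {
    (
        size_of::<FooInOrder<u16, u32>>(),
        size_of::<FooInOrder<u32, u16>>(),
    )
}

fn main() {
    let (tight, padded) = in_order_sizes();
    println!("in order: {tight} and {padded} bytes");
    println!(
        "default:  {} and {} bytes",
        size_of::<Foo<u16, u32>>(),
        size_of::<Foo<u32, u16>>()
    );

    // With `u32` aligned to 4 and `u16` to 2, declaration order needs no
    // padding for <u16, u32> but 4 bytes of padding for <u32, u16>.
    if align_of::<u32>() == 4 && align_of::<u16>() == 2 {
        assert_eq!((tight, padded), (8, 12));
    }
}
```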
127 | Enums make this consideration even more complicated. Naively, an enum such as:
128 |
129 | ```rust
130 | enum Foo {
131 |     A(u32),
132 |     B(u64),
133 |     C(u8),
134 | }
135 | ```
136 |
137 | might be laid out as:
138 |
139 | ```rust
140 | struct FooRepr {
141 |     data: u64, // this is either a u64, u32, or u8 based on `tag`
142 |     tag: u8, // 0 = A, 1 = B, 2 = C
143 | }
144 | ```
145 |
146 | And indeed this is approximately how it would be laid out (modulo the
147 | size and position of `tag`).
148 |
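One observable consequence of the tagged layout can be checked directly: every bit pattern of a `u64` is a valid value, so the tag cannot hide inside the payload and the enum must be larger than its biggest variant. A small sketch, with `tag_needs_extra_room` as a hypothetical helper name:

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
enum Foo {
    A(u32),
    B(u64),
    C(u8),
}

/// Hypothetical helper: true if `Foo` is strictly larger than its largest
/// payload, meaning some bytes are spent on the tag (plus padding).
fn tag_needs_extra_room() -> bool {
    size_of::<Foo>() > size_of::<u64>()
}

fn main() {
    println!("Foo: {} bytes, align {}", size_of::<Foo>(), align_of::<Foo>());
    assert!(tag_needs_extra_room());
    // As for any type, the size is still a multiple of the alignment.
    assert_eq!(size_of::<Foo>() % align_of::<Foo>(), 0);
}
```

The exact size and the position of the tag are left to the compiler.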
149 | However there are several cases where such a representation is inefficient. The
150 | classic case of this is Rust's "null pointer optimization": an enum consisting
151 | of a single outer unit variant (e.g. `None`) and a (potentially nested)
152 | non-nullable pointer variant (e.g. `Some(&T)`) makes the tag unnecessary. A null
153 | pointer can safely be interpreted as the unit (`None`) variant. The net
154 | result is that, for example, `size_of::