# :fire: Rust Programming Tipz :fire:

A collection of software engineering techniques for effectively expressing intent with Rust.

- [Cleanup](#cleanup)
  * [Combating Rightward Pressure](#combating-rightward-pressure)
    + [Basics of De-nesting](#basics-of-de-nesting)
    + [Tuple Matching](#tuple-matching)
  * [Iteration Issues](#iteration-issues)
    + [Pulling the First Error out of an Iterator over `Result`s](#pulling-the-first-error-out-of-an-iterator-over-results)
    + [Reverse Iterator Ranges](#reverse-iterator-ranges)
    + [Empty and Singular Iterators](#empty-and-singular-iterators)
    + [Tuple Structs and Enum Tuple Variants as Functions](#tuple-structs-and-enum-tuple-variants-as-functions)
- [Blocks for Clarity](#blocks-for-clarity)
  * [Closure Capture](#closure-capture)
- [Ergonomics](#ergonomics)
  * [Unification and Reading the Error Messages That Matter](#unification-and-reading-the-error-messages-that-matter)
  * [Write-Compile-Fix Loop Latency](#write-compile-fix-loop-latency)
  * [Caching with sccache](#caching-with-sccache)
  * [Editor support for jumping to compiler errors](#editor-support-for-jumping-to-compiler-errors)
- [Lockdown](#lockdown)
    + [Never](#never)
    + [Deactivating Mutability](#deactivating-mutability)
- [Avoiding Limitations](#avoiding-limitations)
    + [`Box<FnOnce>`](#boxfnonce)
    + [Shared Reference Swap Trick](#shared-reference-swap-trick)
    + [Using Sets As Maps](#using-sets-as-maps)

# Cleanup

This section is about improving clarity.

## Combating Rightward Pressure

After sparring with the compiler, it's not unusual to stand back and see several nested combinator chains or match statements. Much of the art of writing clean Rust has to do with the judicious application of de-nesting techniques.

### Basics of De-nesting

* Use `?` to flatten error handling, but be careful not to convert errors into top-level enums unless it makes sense to handle them at the same point in your code. Keep separate concerns in separate types.
* Split combinator chains apart when they grow beyond one line. Assign useful names to the intermediate steps. In many cases, a multi-line combinator chain can be more clearly rewritten as a for-loop.
* Pattern match on the full complex type instead of using nested match statements.
* If your match statement only has a single pattern that you care about, followed by a wildcard, replace the match statement with an `if let My(Match(Pattern(thing))) = matched_thing { /*...*/ }`, possibly with an `else` branch if you cared about the wildcard earlier.
* Run `cargo clippy`! It can provide many legitimately helpful suggestions for cleaning up your code.

### Tuple Matching

If you find yourself writing code that looks like:

```rust
let a = Some(5);
let b = Some(false);

let c = match a {
    Some(a) => {
        match b {
            Some(b) => whatever,
            None => other_thing,
        }
    }
    None => {
        match b {
            Some(b) => another_thing,
            None => a_fourth_thing,
        }
    }
};
```

it can be de-nested by doing a tuple match:

```rust
let a = Some(5);
let b = Some(false);

let c = match (a, b) {
    (Some(a), Some(b)) => whatever,
    (Some(a), None) => other_thing,
    (None, Some(b)) => another_thing,
    (None, None) => a_fourth_thing,
};
```

As a special case, matching on tuples of booleans can be used to encode decision tables.
For example, here's roughly how `cargo new` handles `--bin` and `--lib` arguments:

```rust
let kind = match (args.is_present("bin"), args.is_present("lib")) {
    (true, true) => failure::bail!("can't specify both lib and binary outputs"),
    (false, true) => NewProjectKind::Lib,
    // default to bin
    (_, false) => NewProjectKind::Bin,
};
```

## Iteration Issues

### Pulling the First Error out of an Iterator over `Result`s

The `collect` method is extremely powerful, and if you have an iterator of `Result` types, you can use it to either return a collection of the `Ok` items, or the very first `Err` item.

From the [std docs on collect](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.collect):

```rust
let results = [Ok(1), Err("nope"), Ok(3), Err("bad")];

let result: Result<Vec<_>, &str> = results.iter().cloned().collect();

// gives us the first error
assert_eq!(Err("nope"), result);

let results = [Ok(1), Ok(3)];

let result: Result<Vec<_>, &str> = results.iter().cloned().collect();

// gives us the list of answers
assert_eq!(Ok(vec![1, 3]), result);
```

[Seen in Sunjay's tweet](https://twitter.com/Sunjay03/status/1051689563683545088)

This functionality is unlocked by the `Result` type implementing `FromIterator<Result<A, E>> for Result<V, E>` where `V` implements `FromIterator<A>`. This may look a bit hairy, but it means that `collect` (which relies on the `FromIterator` trait for its functionality) can output a `Result` where the success type is some collection that can be built from `A` - the success type of the original results. The error type `E` is the same in both, meaning that the returned `Result` will simply carry the first error encountered.
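
As a rough sketch of what that trait bound buys you, the same `collect` call can target any collection that implements `FromIterator<A>`, not just `Vec`. The helper names `parse_all` and `parse_unique` below are made up for illustration and are not part of any library:

```rust
use std::collections::HashSet;
use std::num::ParseIntError;

// Hypothetical helper: parse every string, stopping at the first failure.
// Here `V` is `Vec<u32>` and `E` is `ParseIntError`.
fn parse_all(inputs: &[&str]) -> Result<Vec<u32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<u32>()).collect()
}

// Same pipeline, different target collection: `V` is now `HashSet<u32>`.
fn parse_unique(inputs: &[&str]) -> Result<HashSet<u32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<u32>()).collect()
}

fn main() {
    // Every item parses, so we get the collected values back.
    assert_eq!(parse_all(&["1", "2", "3"]), Ok(vec![1, 2, 3]));

    // "x" does not parse, so the whole result is an error.
    assert!(parse_all(&["1", "x", "3"]).is_err());

    // Duplicates collapse because the target is a set.
    assert_eq!(parse_unique(&["7", "7", "8"]).map(|set| set.len()), Ok(2));
}
```

Only the annotated return type changes between the two helpers; the iterator chain is identical.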

### Reverse Iterator Ranges

In Rust, we can write `for item in 0..50` to go from 0 to 49, but what if we wanted to iterate from 49 to 0? Many of us have written `for item in 50..0` and been surprised that nothing happened. Instead, we can write:

```rust
// iterate from 49 to 0
for item in (0..50).rev() {}

// iterate from 50 to 0
for item in (0..=50).rev() {}

// iterate from 50 to 1
for item in (1..=50).rev() {}
```

Under the hood, when we write a range with the `..=` syntax, we are constructing a [`RangeInclusive`](https://doc.rust-lang.org/std/ops/struct.RangeInclusive.html) instead of the normal [`Range`](https://doc.rust-lang.org/std/ops/struct.Range.html). You can also write a range that covers everything with `..`, or leave one end open, as in `..50`, `..=50`, or `0..`.

[Seen in Andrea Pessino's tweet](https://twitter.com/AndreaPessino/status/1113212969649725440)

### Empty and Singular Iterators

The standard library also includes helpers for empty and singular iterators, using the functions `std::iter::empty` and `std::iter::once`, which can be a small cleanup of common code like `vec![].into_iter()` or `vec![my_item].into_iter()`.

### Tuple Structs and Enum Tuple Variants as Functions

You may have received an error message at some point when you wrote an enum variant, but not the members inside it, and it complained about how you supplied a function instead of an enum:

```rust
enum E {
    A(u64),
}

// ERROR: expected enum `E`, found `fn(u64) -> E {E::A}`
let a: E = E::A;
```

Well, it turns out that enum tuple variants as well as tuple structs can be used as functions from their members to an instance of that enum or struct.
This can be used to wrap each item of a collection in that type:

```rust
// create a vector of E::A's using the variant as a constructor function
let v_of_es: Vec<E> = (0..50).map(E::A).collect();

// v_of_es is now vec![A(0), A(1), A(2), A(3), A(4), ..]

// create a vector of Options using Some as a constructor function
let v_of_options: Vec<Option<u64>> = (0..50).map(Some).collect();

struct B(u64);

// create a vector of B's using the struct as a constructor function
let v_of_bs: Vec<B> = (0..50).map(B).collect();
```

# Blocks for Clarity

Blocks allow us to detangle complex expressions, and can be used anywhere that a one-liner expression would be valid.

## Closure Capture

Specifying variables for use in a closure can be frustrating, and it's common to see code that jumps through hoops to avoid shadowing variables. This is quite common when cloning an `Arc` before spawning a new thread that will own it. But a closure definition is an expression. Anywhere a closure is accepted, we could use a block that evaluates to a closure. In the example below, we use blocks to avoid shadowing the config that we want to pass to several threads, without creating gross names like `config1`, `config2`, etc. Seen in [salsa](https://github.com/salsa-rs/salsa/blob/3dc4539c7c34cb12b5d4d1bb0706324cfcaaa7ae/tests/parallel/cancellation.rs#L42-L53) and described in more detail in [Rust pattern: Precise closure capture clauses](http://smallcultfollowing.com/babysteps/blog/2018/04/24/rust-pattern-precise-closure-capture-clauses/#a-more-general-pattern).

Before, painfully avoiding shadowing config:

```rust
fn spawn_threads(config: Arc<Config>) {
    let config1 = Arc::clone(&config);
    thread::spawn(move || do_x(config1));

    let config2 = Arc::clone(&config);
    thread::spawn(move || do_y(config2));
}
```

After, no need to invent `config_n` names:

```rust
fn spawn_threads(config: Arc<Config>) {
    thread::spawn({
        let config = Arc::clone(&config);
        move || do_x(config)
    });

    thread::spawn({
        let config = Arc::clone(&config);
        move || do_y(config)
    });
}
```

# Ergonomics

One of the most important aspects of feeling productive with the Rust programming language is to find harmony with the compiler. We've all introduced a single error and been whipped in the face by dozens of error messages. Even after years of professional Rust usage, it can feel like a cause for celebration when there are no errors after introducing more than a few new lines of code. Remember that the strictness of the compiler is what gives us so much freedom. Rust is useful for building back-ends, front-ends, embedded systems, databases, and so much more because the compiler knows how long our variables are valid for without using a garbage collector at runtime. Any lifetime-related bug that fails to compile in Rust might have been an exploitable memory corruption issue in C or C++. The compiler pain frees us from exploitation and gives us the ability to work on a wider range of projects.

## Unification and Reading the Error Messages That Matter

Rust requires that arguments and return types are made explicit in function definitions. The compiler will use these explicit types at the boundaries of a function to drive type inference. It will take the input argument types and work from the top of the function toward the bottom. It will take the return type and work its way up.
Hopefully they can meet in the middle. The process under the hood is actually a [little more complicated than this](http://smallcultfollowing.com/babysteps/blog/2017/03/25/unification-in-chalk-part-1/), but this simplified model is adequate to reason about this particular subject. The point is, there has to be an unbroken chain of type evidence that connects the input arguments to the return type through the body. When there is a gap in the chain, all ambiguous types will turn into errors. This is partly why Rust will sometimes emit many pages of errors when there's actually only a single thing that needs to be fixed.

A big part of avoiding compiler fatigue is to just filter out the errors that don't matter. Start with the first one, and work your way down. See the next section for a command that will do this automatically when your code changes.

## Write-Compile-Fix Loop Latency

Programming Rust is a long game. It's common to see beginners spending lots of energy switching back and forth between their editor and a terminal to run rustc, and then scrolling around to find the next error that they want to fix. This is high-friction, and will tire you out faster than if it were automated.

There is a cargo plugin called `cargo watch` that will look for changes in source files under the current working directory, and then run `cargo check`, which skips the LLVM codegen and only looks for compilation errors in your Rust code. It can be installed by typing `cargo install cargo-watch`.

You can use the `cargo watch` plugin to call a specific command when your code changes as well. I like to clear the terminal and keep only the first lines of output, so that the earliest error messages stay on screen:

```bash
cargo watch -s 'clear; cargo check --tests --color=always 2>&1 | head -40'
```

This way I just save my code and it shows the next error.

## Caching with sccache

`sccache` is a tool written by Mozilla that supports `ccache`-style build caching for Rust. This is particularly useful if you frequently clean and build projects with lots of dependencies. Normally they would all need to be recompiled, but with `sccache` the compiled artifacts are stored in a cache that is accessible while building any project on your system. It takes care to do the right thing in the presence of different feature flags, versions, etc. For projects with lots of dependencies it can make a huge difference over time.

The installation is simple:

```
cargo install sccache
export RUSTC_WRAPPER=sccache
```

It can also be configured to use a remote cache such as S3, memcached, or Redis, which is quite useful for building speedy CI clusters.

## Editor support for jumping to compiler errors

To go even further than the last section, most editors have support for jumping to the next Rust error. In vim, you can use the `rust.vim` plugin in combination with Syntastic to automatically run rustc when you save a file, and to jump to errors using a keybind. Emacs users can use `flycheck-rust` for similar functionality.

# Lockdown

This section is about preventing undesirable usage.

### Never

To make a type that can never be created, simply create an empty enum. Use this where you want to represent something that should never actually exist, but a placeholder is required. This is being brought [into the standard library](https://doc.rust-lang.org/std/primitive.never.html) piece by piece, and it's already possible to have a function return `!` if it exits the process or ends in an infinite loop (important for embedded work where main should never return). This can also be used when working with a `Result` that will never actually be an `Err` but you need to adhere to that interface.

```rust
enum Never {}

// let never = Never:: // oh yeah, can't actually create one...
```

### Deactivating Mutability

Here's a pattern for disabling mutability for "finalized" objects, even in mutable owned copies of a thing, preventing misuse. This is done by wrapping the value in a newtype with a private inner field that implements `Deref` but not `DerefMut`:

```rust
mod config {
    #[derive(Clone, Debug, PartialOrd, Ord, Eq, PartialEq)]
    pub struct Immutable<T>(T);

    impl<T> Copy for Immutable<T> where T: Copy {}

    impl<T> std::ops::Deref for Immutable<T> {
        type Target = T;

        fn deref(&self) -> &T {
            &self.0
        }
    }

    #[derive(Default)]
    pub struct Config {
        pub a: usize,
        pub b: String,
    }

    impl Config {
        pub fn build(self) -> Immutable<Config> {
            Immutable(self)
        }
    }
}

use config::Config;

fn main() {
    let mut under_construction = Config {
        a: 5,
        b: "yo".into(),
    };

    under_construction.a = 6;

    let finalized = under_construction.build();

    // at this point, you can make tons of copies,
    // and even if somebody has an owned local version,
    // they won't be able to accidentally change some
    // configuration that has already been finalized
    println!("finalized.a: {}", finalized.a);

    let mut finalized = finalized;

    // the below WON'T work bwahahaha
    // finalized.a = 666;
    // finalized.0.a = 666;
}
```

# Avoiding Limitations

### `Box<FnOnce>`

**Obsolete** since Rust 1.35 ([release notes]), `Box<FnOnce>` just works now.

[release notes]: https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1350-2019-05-23

Currently, it's not possible to call `Box<FnOnce(A) -> R>` on stable Rust.
The common workaround is to use `Box<FnMut(A) -> R>`, store internal state inside of an `Option`, and `take` the state out (with a potential run-time panic) in the call. However, a solution that statically guarantees that the function can be called at most once is possible. Seen in [Cargo](https://github.com/rust-lang/cargo/blob/dc83ead224d8622f748f507574e1448a28d8dcc7/src/cargo/core/compiler/job.rs#L17-L25).

```rust
trait FnBox<A, R> {
    fn call_box(self: Box<Self>, a: A) -> R;
}

impl<A, R, F: FnOnce(A) -> R> FnBox<A, R> for F {
    fn call_box(self: Box<F>, a: A) -> R {
        (*self)(a)
    }
}

fn demo(f: Box<dyn FnBox<(), String>>) -> String {
    f.call_box(())
}

#[test]
fn test_demo() {
    let hello = "hello".to_string();
    let f: Box<dyn FnBox<(), String>> = Box::new(move |()| hello);
    assert_eq!(&demo(f), "hello");
}
```

Note that `self: Box<Self>` is stable and object-safe.

### Shared Reference Swap Trick

`std::cell::Cell<T>` is most commonly used with `Copy` types, because the `Cell::<T>::get` method requires `T: Copy`. However, a `Cell` can be useful with non-`Copy` types as well, thanks to these two methods:

```rust
fn replace(&self, val: T) -> T;

fn take(&self) -> T
where
    T: Default
;
```

In particular, using a `Cell` one can implement an analogue of `std::mem::swap` (swap trick) or `std::mem::replace` ([Jones's trick][trick]) which doesn't need a `&mut` reference.

[trick]: http://giphygifs.s3.amazonaws.com/media/MS0fQBmGGMaRy/giphy.gif

The following example uses `Cell::take` to implement `fmt::Display` for an iterator. `fmt::Display::fmt` only receives `&self`, but we need to consume the iterator to print it.
`Cell` allows us to do exactly that:

```rust
use std::{cell::Cell, fmt};

fn display_iter<I>(xs: I) -> impl fmt::Display
where
    I: Iterator,
    I::Item: fmt::Display,
{
    struct IterFmt<I>(Cell<Option<I>>);

    impl<I> fmt::Display for IterFmt<I>
    where
        I: Iterator,
        I::Item: fmt::Display,
    {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            // Advanced Jones's trick: `mem::replace` with `&` reference!
            let xs: Option<I> = self.0.take();
            let xs: I = xs.unwrap();

            let mut first = true;
            for item in xs {
                if !first {
                    f.write_str(", ")?;
                }
                first = false;
                fmt::Display::fmt(&item, f)?;
            }

            Ok(())
        }
    }

    IterFmt(Cell::new(Some(xs)))
}

fn main() {
    let xs = vec![1, 2, 3].into_iter();
    assert_eq!(display_iter(xs).to_string(), "1, 2, 3");
}
```

First seen in [rustc](https://github.com/rust-lang/rust/blob/6b5f9b2e973e438fc1726a2d164d046acd80b170/src/librustdoc/html/format.rs#L1061).

### Using Sets As Maps

Often, you want to store structs in an associative container keyed by a field of the struct. For example:

```rust
use std::collections::HashMap;

struct Person {
    name: String,
    age: u32,
}

fn persons_by_name(persons: Vec<Person>) -> HashMap<String, Person> {
    persons.into_iter()
        .map(|p| (p.name.clone(), p))
        .collect()
}
```

The simplest way to do this is to clone the keys, like in the example above.
The drawback of this approach is that the container contains two copies of each key, which might be a problem if the keys are large or are not `Clone`.

It seems like it should be possible to borrow the key from the struct, but doing this in a straightforward way fails:

```rust
fn persons_by_name(persons: Vec<Person>) -> HashMap<&'??? str, Person>
```

There isn't any lifetime in sight you can use instead of `???`.

However, it is possible to use a `Set` instead of a `Map` to get a similar effect:

```rust
use std::{
    borrow::Borrow,
    collections::HashSet,
    hash::{Hash, Hasher},
};

struct Person {
    name: String,
    age: u32,
}

impl Borrow<str> for Person {
    fn borrow(&self) -> &str {
        &self.name
    }
}

impl PartialEq for Person {
    fn eq(&self, other: &Person) -> bool {
        self.name == other.name
    }
}

impl Eq for Person {}

impl Hash for Person {
    fn hash<H: Hasher>(&self, hasher: &mut H) {
        self.name.hash(hasher)
    }
}

fn persons_by_name(persons: Vec<Person>) -> HashSet<Person> {
    persons.into_iter().collect()
}

fn get_person_by_name<'p>(persons: &'p HashSet<Person>, name: &str) -> Option<&'p Person> {
    persons.get(name)
}
```

The crux of the trick here is that Rust sets and maps use the `Borrow` trait.
It allows lookup operations by types different from those stored in the container.
In this example, by implementing `Borrow<str> for Person`, we get the ability to get a `&Person` out of `&HashSet<Person>` by `&str`.

Note that, because we implement `Borrow`, we **must** implement `Eq` and `Hash` by hand to be consistent with it.
That is, we compare persons by looking only at the name, and ignore the age.
This is likely not what you want for the rest of your application, so you might need to additionally create a `struct PersonByName(Person)` wrapper type and implement `Borrow` for that.
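
A minimal sketch of that wrapper approach might look like the following. It assumes `Person` derives ordinary field-by-field equality and hashing, and that only the `PersonByName` wrapper compares by name; the sample data in `main` is invented for illustration:

```rust
use std::{
    borrow::Borrow,
    collections::HashSet,
    hash::{Hash, Hasher},
};

// `Person` keeps field-by-field equality for the rest of the program.
#[derive(Debug, PartialEq, Eq, Hash)]
struct Person {
    name: String,
    age: u32,
}

// Wrapper used only as the set element; it compares and hashes by name alone.
struct PersonByName(Person);

impl Borrow<str> for PersonByName {
    fn borrow(&self) -> &str {
        &self.0.name
    }
}

impl PartialEq for PersonByName {
    fn eq(&self, other: &PersonByName) -> bool {
        self.0.name == other.0.name
    }
}

impl Eq for PersonByName {}

impl Hash for PersonByName {
    fn hash<H: Hasher>(&self, hasher: &mut H) {
        // Must agree with how `str` hashes, because lookups hash the `&str` key.
        self.0.name.hash(hasher)
    }
}

fn persons_by_name(persons: Vec<Person>) -> HashSet<PersonByName> {
    // The tuple struct doubles as a constructor function.
    persons.into_iter().map(PersonByName).collect()
}

fn get_person_by_name<'p>(persons: &'p HashSet<PersonByName>, name: &str) -> Option<&'p Person> {
    persons.get(name).map(|wrapped| &wrapped.0)
}

fn main() {
    let set = persons_by_name(vec![
        Person { name: "Ada".to_string(), age: 36 },
        Person { name: "Bob".to_string(), age: 41 },
    ]);
    assert_eq!(get_person_by_name(&set, "Ada").map(|p| p.age), Some(36));
    assert!(get_person_by_name(&set, "Eve").is_none());
}
```

With this split, two `Person` values with the same name but different ages still compare as unequal everywhere except inside the set.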

Sources: the original pattern overheard from [@elizarov](https://github.com/elizarov) in the context of Kotlin, Rust's stdlib docs, [recent conversation](https://www.reddit.com/r/rust/comments/d1ys9b/hashmaplike_data_structure_where_the_keys_are/) on r/rust.