├── .gitignore
├── Cargo.lock
├── Cargo.toml
├── LICENSE.md
├── README.md
└── src
    └── main.rs

/.gitignore:
--------------------------------------------------------------------------------
/target
**/*.rs.bk
.idea
--------------------------------------------------------------------------------
/Cargo.lock:
--------------------------------------------------------------------------------
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
[[package]]
name = "green_threads"
version = "0.1.0"
--------------------------------------------------------------------------------
/Cargo.toml:
--------------------------------------------------------------------------------
[package]
name = "green_threads"
version = "0.1.0"
authors = ["Carl Fredrik Samson "]
edition = "2018"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
Copyright 2019 Carl Fredrik Samson

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Green Threads Example

This repo was made to accompany the article and Gitbook.

Gitbook: [https://cfsamson.gitbook.io/green-threads-explained-in-200-lines-of-rust/](https://cfsamson.gitbook.io/green-threads-explained-in-200-lines-of-rust/)

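## Example

A condensed version of the `main` function in `src/main.rs`, just to show how the runtime is driven. Note that `src/main.rs` uses the unstable `llvm_asm` and `naked_functions` features, so building it requires a nightly toolchain:

```rust
fn main() {
    // Create the runtime and store a pointer to it so `yield_task` can reach it.
    let mut runtime = Runtime::new();
    runtime.init();

    // Spawn two tasks that print a counter and yield back to the scheduler
    // after every iteration.
    runtime.spawn(|| {
        for i in 0..10 {
            println!("task: 1 counter: {}", i);
            yield_task();
        }
    });
    runtime.spawn(|| {
        for i in 0..15 {
            println!("task: 2 counter: {}", i);
            yield_task();
        }
    });

    // Round-robins between the tasks until both have finished.
    runtime.run();
}
```

Since each task yields after every iteration, the two counters print interleaved.
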
## Branches
There are a few interesting branches:
1. `master` - this is the 200 lines of code in the book
2. `commented` - this is the same 200 lines, but extensively commented along the way
3. `windows` - this is an implementation with a proper context switch on Windows, also a copy of the code in the book
4. `trait_objects` - this is an implementation where we can take trait objects like `Fn()`, `FnMut()` and `FnOnce()` instead of just function pointers. This is much more useful, but it currently lags a bit behind the improvements in the first three branches.
5. `futures` - I'm collecting data and playing around to tie this in to Rust's Futures and async story - see below

## Futures
The end goal was (and still is) for me to use this as a basis to investigate and implement a simple example of the Executor-Reactor pattern using
Futures 3.0 and Rust's async/await syntax.

The main idea is that reading these two books will give most people a pretty deep understanding of async code and how Futures work, and thereby
bridge the gap between the documentation that is certainly going to come from the Rust docs team and from libraries like `Tokio`, using an example-driven
approach to learning the basics pretty much from the ground up. I will not focus too much on exactly how `tokio`, `romio` or `mio` work, since they are good
implementations that carry a large amount of complexity in themselves.

The threading implementation used in the first book will probably be changed slightly to serve as an `executor`, and instead of just spawning
`fn()` or trait objects we spawn futures.

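To make that idea a bit more concrete, here is a rough, hypothetical sketch (not code from any of the books, and none of these names exist in this repo) of what spawning futures instead of `fn()` pointers can look like. It uses only the standard library: tasks become boxed futures, and a minimal "executor" polls them in a loop with a no-op waker, much like `t_yield` round-robins between green threads.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, RawWaker, RawWakerVTable, Waker};

// A boxed, pinned future takes the place of the `fn()` pointer a task holds today.
type BoxedTask = Pin<Box<dyn Future<Output = ()>>>;

// A do-nothing waker is enough for this sketch: the "executor" below simply polls
// every task in a loop instead of waiting to be woken.
fn noop_waker() -> Waker {
    unsafe fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    unsafe fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    // "Spawning" is just collecting boxed futures instead of function pointers.
    let mut tasks: Vec<BoxedTask> = Vec::new();
    tasks.push(Box::pin(async { println!("task 1 done") }));
    tasks.push(Box::pin(async { println!("task 2 done") }));

    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);

    // Poll every task until all of them have completed.
    while !tasks.is_empty() {
        let mut pending = Vec::new();
        for mut task in tasks {
            if task.as_mut().poll(&mut cx).is_pending() {
                pending.push(task);
            }
        }
        tasks = pending;
    }
}
```

A real executor would only poll a task again after its `Waker` has been triggered by a reactor, instead of busy-polling like this sketch does.
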
## Update 2019-07-31
I decided to divide this deep dive into asynchronous code into three books.

1. Green Threads

Learn about the way of doing multitasking that Go, Ruby and many other languages use. This also explains a lot of OS basics, since threads and context switching are an important part of how operating systems manage to multitask. We explore this by creating our own green threads implementation.

2. Learn Async Programming

This book is about two other ways of performing async execution: thread pools and epoll/kqueue/IOCP. We explore this by implementing a very simplified version of Node's event loop. It's an interesting example since it uses both ways of running code asynchronously. This book will in some sense be higher level than green threads, but will talk much more about working together with the operating system to run async code effectively.

I'm working on the book right now, so if you want to follow along and/or give feedback you can have a look here: https://cfsamson-1.gitbook.io/async-basics-explained-with-rust/

3. Futures in Rust

OK, so the previous two books set up all the basic information we need to get a good understanding of Rust's futures. I'll try to explain them by implementing a very simplified version of the whole stack: libc, mio (reactor) and an executor runtime (tokio). We'll talk about the design of futures, how they work and why they work the way they do. Since we covered so much in the previous two books, this one will be more Rust-centric and focused on Futures, since we already have most of the async basics covered. Work on this book has not really started yet.

## Changelog
**2019-12-22:** Added one line of code to make sure the memory we get from the allocator is 16 byte aligned. Refactored to use the "high" memory
address as the basis for offsets when writing to the stack, since this made alignment easier. See Issue #12 for more information.

**2019-06-26:** The Supporting Windows appendix treated the XMM fields as 64 bits, but they are 128 bits, which was an oversight on my part. Correcting this added some interesting material to that chapter, but unfortunately also some complexity. However, it's now corrected and explained in both the book and the repo. It now only slightly deviates from the [Boost Context library implementation](https://github.com/boostorg/context/blob/develop/src/asm/ontop_x86_64_ms_pe_gas.asm), which I consider one of the better implementations out there.

**2019-06-21:** Rather substantial change and cleanup. An issue was reported that Valgrind had some trouble with the code and crashed. This is now fixed and there are currently no unsolved issues. In addition, the code now runs on both debug and release builds without any issues on all platforms. Thanks to everyone for reporting issues they found.

**2019-06-18:** New chapter implementing proper Windows support
--------------------------------------------------------------------------------
/src/main.rs:
--------------------------------------------------------------------------------
#![feature(llvm_asm)]
#![feature(naked_functions)]
use std::ptr;

// In our simple example we set most constraints here.
const DEFAULT_STACK_SIZE: usize = 1024 * 1024 * 2;
const MAX_TASKS: usize = 4;
static mut RUNTIME: usize = 0;

pub struct Runtime {
    tasks: Vec<Task>,
    current: usize,
}

#[derive(PartialEq, Eq, Debug)]
enum State {
    Available,
    Running,
    Ready,
}

struct Task {
    id: usize,
    stack: Vec<u8>,
    ctx: TaskContext,
    state: State,
}

#[derive(Debug, Default)]
#[repr(C)] // not strictly needed, but the Rust ABI is not guaranteed to be stable
struct TaskContext {
    // 15 u64
    x1: u64,  //ra: return address
    x2: u64,  //sp
    x8: u64,  //s0, fp
    x9: u64,  //s1
    x18: u64, //x18-27: s2-11
    x19: u64,
    x20: u64,
    x21: u64,
    x22: u64,
    x23: u64,
    x24: u64,
    x25: u64,
    x26: u64,
    x27: u64,
    nx1: u64, //new return address
}

impl Task {
    fn new(id: usize) -> Self {
        // We initialize each task here and allocate the stack. This is not necessary;
        // we could allocate the memory later, but doing it here keeps complexity down
        // and lets us focus on the more interesting parts. The important part is that
        // once allocated it MUST NOT move in memory.
        Task {
            id,
            stack: vec![0_u8; DEFAULT_STACK_SIZE],
            ctx: TaskContext::default(),
            state: State::Available,
        }
    }
}

impl Runtime {
    pub fn new() -> Self {
        // This will be our base task, which will be initialized in the `Running` state.
        let base_task = Task {
            id: 0,
            stack: vec![0_u8; DEFAULT_STACK_SIZE],
            ctx: TaskContext::default(),
            state: State::Running,
        };

        // We initialize the rest of our tasks.
        let mut tasks = vec![base_task];
        let mut available_tasks: Vec<Task> = (1..MAX_TASKS).map(|i| Task::new(i)).collect();
        tasks.append(&mut available_tasks);

        Runtime {
            tasks,
            current: 0,
        }
    }

    /// This is cheating a bit, but we need a pointer to our Runtime stored so we can call yield on it
    /// even if we don't have a reference to it.
    pub fn init(&self) {
        unsafe {
            let r_ptr: *const Runtime = self;
            RUNTIME = r_ptr as usize;
        }
    }

    /// This is where we start running our runtime. If it is our base task, we call yield until
    /// it returns false (which means that there are no tasks scheduled) and we are done.
    pub fn run(&mut self) -> ! {
        while self.t_yield() {}
        std::process::exit(0);
    }

    /// This is our return function. The only place we use it is in our `guard` function.
    /// If the current task is not our base task we set its state to `Available`, meaning
    /// we're finished with it. Then we yield, which will schedule a new task to be run.
    fn t_return(&mut self) {
        if self.current != 0 {
            self.tasks[self.current].state = State::Available;
            self.t_yield();
        }
    }

    /// This is the heart of our runtime. Here we go through all tasks and see if any of them is in the
    /// `Ready` state. If no task is `Ready` we're all done. This is an extremely simple scheduler using
    /// only a round-robin algorithm.
    ///
    /// If we find a task that's ready to be run we change the state of the current task from `Running` to `Ready`.
    /// Then we call `switch`, which saves the current context (the old context) and loads the new context
    /// into the CPU, which then resumes based on the context it was just passed.
    fn t_yield(&mut self) -> bool {
        let mut pos = self.current;
        while self.tasks[pos].state != State::Ready {
            pos += 1;
            if pos == self.tasks.len() {
                pos = 0;
            }
            if pos == self.current {
                return false;
            }
        }

        if self.tasks[self.current].state != State::Available {
            self.tasks[self.current].state = State::Ready;
        }

        self.tasks[pos].state = State::Running;
        let old_pos = self.current;
        self.current = pos;

        unsafe {
            switch(&mut self.tasks[old_pos].ctx, &self.tasks[pos].ctx);
        }

        // NOTE: this might look strange, and it is. Normally we would just mark this as `unreachable!()`,
        // but our compiler is too smart for its own good, so it optimized our code away on release builds.
        // Curiously this happens on Windows and not on Linux. This is a common problem in tests, so Rust
        // has a `black_box` function in the `test` crate that will "pretend" to use a value we give it to
        // prevent the compiler from eliminating code. I'll just do this instead; this code will never be
        // run anyway, and if it were it would always be `true`.
        self.tasks.len() > 0
    }

    /// While `yield` is the logically interesting function, I think this is the technically most interesting.
    ///
    /// When we spawn a new task we first check if there are any available tasks (tasks in the `Available` state).
    /// If we run out of tasks we panic in this scenario, but there are several (better) ways to handle that.
    /// We keep things simple for now.
    ///
    /// When we find an available task we get the stack length and a pointer to our u8 byte array.
    ///
    /// In the next part we have to use some unsafe functions. First we write the address of our `guard`
    /// function into the saved return address, so it is called when the function we provide returns.
    /// Then we write the address of the function we pass in as the new return address, which is where
    /// execution starts when the task is first scheduled.
    ///
    /// Third, we set the value of `sp`, the stack pointer, so it points into the task's own (aligned) stack,
    /// giving the task a valid stack to use when it is scheduled to run.
    ///
    /// Lastly we set the state to `Ready`, which means we have work to do and are ready to do it.
    pub fn spawn(&mut self, f: fn()) {
        let available = self
            .tasks
            .iter_mut()
            .find(|t| t.state == State::Available)
            .expect("no available task.");

        let size = available.stack.len();
        unsafe {
            let s_ptr = available.stack.as_mut_ptr().offset(size as isize);

            // Make sure our stack pointer is 8 byte aligned - the mask will always
            // round down to a lower memory address. Since we know we're at the "high"
            // end of our allocated space, we know that rounding down gives us a valid
            // address (given that we actually allocated enough space to get an aligned
            // pointer in the first place).
            let s_ptr = (s_ptr as usize & !7) as *mut u8;

            available.ctx.x1 = guard as u64; //ctx.x1 is the old return address
            available.ctx.nx1 = f as u64; //ctx.nx1 is the new return address
            available.ctx.x2 = s_ptr.offset(-32) as u64; //ctx.x2 is sp
        }
        available.state = State::Ready;
    }
}

/// This is our guard function, which the spawned function returns into when it's done. All it does is
/// call `t_return`, which sets the state of the current task back to `Available` and then yields so a
/// new task can be scheduled.
fn guard() {
    unsafe {
        let rt_ptr = RUNTIME as *mut Runtime;
        (*rt_ptr).t_return();
    };
}

/// We know that Runtime is alive for the length of the program and that we only access it from one core
/// (so no data race). We yield execution of the current task by dereferencing a pointer to our
/// Runtime and then calling `t_yield`.
pub fn yield_task() {
    unsafe {
        let rt_ptr = RUNTIME as *mut Runtime;
        (*rt_ptr).t_yield();
    };
}

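// The function below is a hypothetical illustration only - it is not part of the runtime and is never
// called. It shows an `llvm_asm!` invocation that uses all four of the segments described in the doc
// comment for `switch` (outputs, inputs, clobbers and options), since the real context switch leaves
// the first three empty and relies on the calling convention instead.
#[allow(dead_code)]
fn asm_segments_example(x: u64) -> u64 {
    let incremented: u64;
    unsafe {
        llvm_asm!("addi $0, $1, 1"  // assembly template
            : "=r"(incremented)     // outputs
            : "r"(x)                // inputs
            : "memory"              // clobbers
            : "volatile");          // options
    }
    incremented
}
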
/// So here is our inline assembly. As you remember from our first example, this is just a bit more
/// elaborate: we first save the values of all the registers we need into the "old" context, and then
/// set the registers to the values that were saved when we suspended execution of the "new" task.
///
/// This is essentially all we need to do to save and resume execution.
///
/// Some details about inline assembly:
///
/// The assembly commands in the string literal are called the assembly template. The template is
/// followed by up to four segments separated by ":":
///
/// - First ":": our output parameters, i.e. the parameters this function will return.
/// - Second ":": the input parameters, which are our contexts. We only read from the "new" context,
///   but we modify the "old" context, saving our registers there (see the volatile option below).
/// - Third ":": our clobber list; it tells the compiler that these registers can't be used freely.
/// - Fourth ":": the options we can pass in. Rust has 3: "alignstack", "volatile" and "intel".
///
/// For this to work on Windows we need to use "alignstack", where the compiler adds the necessary
/// padding to make sure our stack is aligned. Since we modify one of our inputs, our assembly has
/// "side effects", therefore we should use the `volatile` option. I **think** this is actually set
/// for us by default when there are no output parameters given (my own assumption after going through
/// the source code for the `asm` macro), but we should make it explicit anyway.
///
/// One last important part (it will not work without this) is the `#[naked]` attribute. Basically this
/// lets us have full control over the stack layout, since normal functions have a prologue and epilogue
/// added by the compiler that will cause trouble for us. We avoid this by marking the function as "naked".
/// For this to work on `release` builds we also need to use the `#[inline(never)]` attribute, or else the
/// compiler decides to inline this function (curiously this currently only happens on Windows).
/// If the function is inlined we get a curious runtime error where it fails when switching back
/// to a saved context, and in general our assembly will not work as expected.
///
/// see: https://github.com/rust-lang/rfcs/blob/master/text/1201-naked-fns.md
#[naked]
#[inline(never)]
unsafe fn switch(old: *mut TaskContext, new: *const TaskContext) {
    // a0: old, a1: new
    llvm_asm!("
        sd x1, 0x00(a0)
        sd x2, 0x08(a0)
        sd x8, 0x10(a0)
        sd x9, 0x18(a0)
        sd x18, 0x20(a0)
        sd x19, 0x28(a0)
        sd x20, 0x30(a0)
        sd x21, 0x38(a0)
        sd x22, 0x40(a0)
        sd x23, 0x48(a0)
        sd x24, 0x50(a0)
        sd x25, 0x58(a0)
        sd x26, 0x60(a0)
        sd x27, 0x68(a0)
        sd x1, 0x70(a0)

        ld x1, 0x00(a1)
        ld x2, 0x08(a1)
        ld x8, 0x10(a1)
        ld x9, 0x18(a1)
        ld x18, 0x20(a1)
        ld x19, 0x28(a1)
        ld x20, 0x30(a1)
        ld x21, 0x38(a1)
        ld x22, 0x40(a1)
        ld x23, 0x48(a1)
        ld x24, 0x50(a1)
        ld x25, 0x58(a1)
        ld x26, 0x60(a1)
        ld x27, 0x68(a1)
        ld t0, 0x70(a1)

        jr t0
        "
    : : : : "volatile", "alignstack"
    );
}

fn main() {
    let mut runtime = Runtime::new();
    runtime.init();
    runtime.spawn(|| {
        println!("TASK 1 STARTING");
        let id = 1;
        for i in 0..10 {
            println!("task: {} counter: {}", id, i);
            yield_task();
        }
        println!("TASK 1 FINISHED");
    });
    runtime.spawn(|| {
        println!("TASK 2 STARTING");
        let id = 2;
        for i in 0..15 {
            println!("task: {} counter: {}", id, i);
            yield_task();
        }
        println!("TASK 2 FINISHED");
    });
    runtime.run();
}
--------------------------------------------------------------------------------