├── .gitignore
├── .gitmodules
├── .travis.yml
├── LICENSE-CC-BY-4.0
├── README.md
├── config.toml
├── content
│   ├── brave-new-io.md
│   ├── cpu-monitor.md
│   ├── embedded-rust-in-2018.md
│   ├── fearless-concurrency.md
│   ├── hello-world.md
│   ├── itm.md
│   ├── microamp.md
│   ├── multicore-rtfm.md
│   ├── quickstart.md
│   ├── rtfm-overhead.md
│   ├── rtfm-v2.md
│   ├── rtfm-v3.md
│   ├── rtfm-v4.md
│   ├── safe-dma.md
│   ├── stack-analysis-2.md
│   ├── stack-analysis.md
│   ├── stack-overflow-protection.md
│   ├── wd-1-2-l3gd20-lsm303dlhc-madgwick.md
│   └── wd-4-enc28j60.md
├── deploy.sh
└── static
    ├── CNAME
    ├── brave-new-io
    │   ├── cortex-m.webm
    │   └── rpi.webm
    ├── cpu-monitor
    │   ├── blinky.svg
    │   └── loopback.webm
    ├── fearless-concurrency
    │   ├── concurrency.svg
    │   ├── control.webm
    │   ├── loopback.svg
    │   ├── minicom.png
    │   ├── periodic.svg
    │   ├── preemption.svg
    │   └── roulette.webm
    ├── itm
    │   ├── blue-pill-itm.jpg
    │   ├── blue-pill-serial.jpg
    │   ├── blue-pill.jpg
    │   └── f3.jpg
    ├── logo
    │   ├── formation.png
    │   ├── memfault.svg
    │   ├── pollen-robotics.png
    │   └── sharebrained.png
    ├── quickstart
    │   ├── blink.webm
    │   ├── exception-handler.png
    │   ├── f3.jpg
    │   ├── gdb.png
    │   ├── main.png
    │   ├── memory.svg
    │   ├── monotask.svg
    │   └── multitask.svg
    ├── robots.txt
    ├── rtfm-overhead
    │   ├── tasks.svg
    │   └── threads.svg
    ├── stack-analysis-2
    │   ├── drop.svg
    │   ├── dyn-wrong.svg
    │   ├── dyn.svg
    │   ├── fn-wrong.svg
    │   └── fn.svg
    ├── stack-analysis
    │   ├── 6lowpan.svg
    │   ├── cycle.svg
    │   ├── direct.svg
    │   ├── fib.svg
    │   ├── fmt.svg
    │   ├── fmul.svg
    │   ├── fn.svg
    │   ├── lossy-fixed.svg
    │   ├── lossy.svg
    │   ├── memclr.svg
    │   ├── nop.svg
    │   ├── select-fixed.svg
    │   ├── select.svg
    │   └── to.svg
    ├── stack-overflow-protection
    │   ├── heap.svg
    │   ├── overflow.svg
    │   ├── swapped-heap.svg
    │   └── swapped.svg
    ├── wd-1-2-l3gd20-lsm303dlhc-madgwick
    │   ├── accel.svg
    │   ├── eights.webm
    │   ├── gyro-calibrated.svg
    │   ├── gyro.svg
    │   ├── l3gd20.svg
    │   ├── lsm303dlhc-accel.svg
    │   ├── lsm303dlhc-mag.svg
    │   ├── mag-calibrated.svg
    │   ├── mag.svg
    │   └── viz.webm
    └── wd-4-enc28j60
        ├── coap.webm
        └── enc28j60.jpg
/.gitignore:
--------------------------------------------------------------------------------
1 | *.org
2 | public
3 |
--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
1 | [submodule "themes/hucore"]
2 | path = themes/hucore
3 | url = https://github.com/japaric/hucore
4 |
--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
1 | script:
2 | - bash deploy.sh
3 |
4 | branches:
5 | only:
6 | - master
7 |
8 | notifications:
9 | email:
10 | on_success: never
11 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # `embedded-in-rust`
2 |
3 | This is the "source code" of my blog.
4 |
5 | ## Building
6 |
7 | You can spin up a local instance of my blog using [hugo].
8 |
9 | [hugo]: https://gohugo.io
10 |
11 | ```
12 | $ hugo server
13 | ```
14 |
15 | ## License
16 |
17 | Licensed under the
18 |
19 | - Creative Commons Attribution 4.0 License
20 | ([LICENSE-CC-BY-4.0](LICENSE-CC-BY-4.0) or
21 | https://creativecommons.org/licenses/by/4.0/legalcode.txt)
22 |
23 | ### Contribution
24 |
25 | Unless you explicitly state otherwise, any contribution intentionally submitted
26 | for inclusion in the work by you shall be licensed as above, without any
27 | additional terms or conditions.
28 |
--------------------------------------------------------------------------------
/config.toml:
--------------------------------------------------------------------------------
1 | baseURL = "https://blog.japaric.io/"
2 | copyright = " Jorge Aparicio"
3 | enableEmoji = true
4 | languageCode = "en-us"
5 | theme = "hucore"
6 | title = "Embedded in Rust"
7 |
8 | [taxonomies]
9 | tag = "tags"
10 |
11 | [params]
12 | author = "Jorge Aparicio"
13 | description = "A blog about Rust and embedded stuff"
14 | displayauthor = true
15 | keywords = ["Rust", "embedded", "development"]
16 | sharingicons = false
17 |
18 | [params.highlight]
19 | languages = ["armasm", "c", "diff", "llvm", "rust", "shell"]
20 | style = "github"
21 | theme = "tomorrow-night"
22 |
23 | [[params.social]]
24 | fa_icon = "fa-github"
25 | url = "https://github.com/japaric"
26 |
27 | [[params.social]]
28 | fa_icon = "fa-twitter"
29 | url = "https://twitter.com/japaric_io"
30 |
31 | [[params.social]]
32 | fa_icon = "fa-rss"
33 | url = "/index.xml"
34 |
--------------------------------------------------------------------------------
/content/embedded-rust-in-2018.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Embedded Rust in 2018"
3 | date: 2018-01-21T22:10:38+01:00
4 | draft: false
5 | ---
6 |
7 | This is my [#Rust2018] blog post.
8 |
9 | [#Rust2018]: https://blog.rust-lang.org/2018/01/03/new-years-rust-a-call-for-community-blogposts.html
10 |
11 | These are some things I think the Rust team needs to address this year to make Rust a (more) viable
12 | alternative to C/C++ in the area of bare metal (i.e. `no_std`) embedded applications.
13 |
14 | # Stability
15 |
16 | Here's a list of breakage / regressions *I* encountered (i.e. that I had to work around / fix)
17 | during 2017:
18 |
19 | - Changes in target specification files broke compilation of no_std projects that use custom
20 | targets. Happened once or twice this year (it has happened in 2016 too); don't recall the exact
21 | number.
22 |
23 | - Adding column information to panic messages, which changed the signature of `panic_fmt`, bloated
24 | binary size by 200-600%.
25 |
26 | - ThinLTO, which became enabled by default, broke linking in release mode.
27 |
28 | - Parallel codegen, which became enabled by default, broke linking in dev mode.
29 |
30 | - Incremental compilation, which became enabled by default, broke linking in dev mode. Or maybe it
31 | was the `Termination` trait stuff. Neither is the direct cause but either change made [an old bug]
32 | resurface. This is still [unfixed] and disabling both incremental compilation and parallel codegen
33 | is the best way to avoid the problem.
34 |
35 | [an old bug]: https://github.com/rust-lang/rust/issues/18807
36 | [unfixed]: https://github.com/rust-lang/rust/issues/47074
37 |
38 | - The `Termination` trait broke one of the core crates of the Cortex-M ecosystem (and every other
39 | user of the `start` lang item).
40 |
41 | - A routine dependency update (`cargo update`) in rust-lang/rust broke one of Xargo's use cases.
42 | Fixing the issue in Xargo broke another use case. Finally, undoing the fix a few days later fixed
43 | both use cases.
44 |
45 | - A change in libcore broke compilation of it for ARMv6-M and MSP430, and probably other custom
46 | targets. This happened twice.
47 |
48 | - I recall some breakage related to compiler-builtins but don't remember the details.
49 |
50 | Note that only *two* of these are actually related to *feature gated* language features (`start` and
51 | `panic_fmt`). Target specification files are not feature gated even though they are considered
52 | unstable by the Rust team.
53 |
54 | Ideally, this list should be empty this year. [As others have expressed][thejpster] it's
55 | demotivating to come back to a project after a while and see that it no longer builds. And this
56 | instability can be exhausting for library crate authors / maintainers; let me explain:
57 |
58 | [thejpster]: http://railwayelectronics.blogspot.se/2018/01/i-recently-picked-up-embedded-project.html
59 |
60 | If a library crate has 10 users those users can potentially use up to 10 *different* nightly
61 | versions at any point in time. The bigger this nightly spread the higher the chance of (a) users
62 | reporting issues, which usually are rustc issues or language level breaking changes, that
63 | occur on nightlies newer than the one the crate author tested, and of (b) users reporting already
64 | fixed issues that occur on nightlies older than the one the author tested.
65 |
66 | I've seen some people suggest pinning crates to some specific nightly version using a
67 | `rust-toolchain` file as a solution to the stability problem. That may work for projects centered
68 | around binary crates like Servo and for projects that use monorepos like Tock but it doesn't work
69 | for library crates because the `rust-toolchain` files of dependencies are ignored.
70 |
71 | Library authors could restrict their crates to building only with a certain range of nightlies by
72 | checking the compiler version in their crate's build script, but that makes them less composable: a
73 | downstream user may not be able to use your crate if they are also using some other crate that
74 | restricts its use to a range of nightlies incompatible with your crate's restrictions. There are
75 | other issues as well: I actually tried this approach and [broke] the docs.rs build of my
76 | `cortex-m-rt` crate *and* the docs.rs builds of all the reverse dependencies of my crate.
77 |
78 | [broke]: https://docs.rs/crate/f3/0.5.0/builds/82700
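
For reference, the build-script version check mentioned above boils down to something like this
sketch (the check itself is hypothetical; in practice crates often reach for a helper crate instead
of rolling their own):

``` rust
// build.rs -- rough sketch of a compiler-version check (hypothetical policy)
use std::{env, process::Command};

fn main() {
    // Cargo tells the build script which compiler it will use via `RUSTC`
    let rustc = env::var("RUSTC").unwrap_or_else(|_| "rustc".to_string());
    let output = Command::new(rustc).arg("--version").output().unwrap();
    let version = String::from_utf8(output.stdout).unwrap();

    // e.g. `rustc 1.25.0-nightly (b5392f545 2018-01-20)`
    if !version.contains("nightly") {
        panic!("this crate only builds with a nightly compiler");
    }
    // a real check would also parse the date and reject nightlies outside
    // the range the crate author has actually tested
}
```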
79 |
80 | ## Establishing a first line of defense
81 |
82 | Around half of the issues in my 2017 list were eventually fixed in rustc or in the std facade and
83 | required no modification of user code. These issues could have been spotted and fixed by Rust
84 | developers *before* they landed if the Rust test system incorporated building some embedded crates
85 | as one of its tests.
86 |
87 | Of course, compiler development should not be halted because some crate stops compiling due to a
88 | breaking change in an unstable feature. In those cases, the result of building *that* crate should
89 | be explicitly marked as "ignore" to let the PR land.
90 |
91 | Being able to ignore a failed build seems to defeat the purpose but even in that scenario this
92 | system serves as a way to notify the crate author about the upcoming breakage; that way they can
93 | start taking measures before the PR lands.
94 |
95 | There's a mechanism for temporarily ignoring some parts of a CI build already in rust-lang/rust
96 | (it's used to test the RLS, clippy, etc.) that could be used for this purpose.
97 |
98 | ## Stabilization in baby steps
99 |
100 | The ultimate solution to the instability problem is to make embedded development possible on stable.
101 | Unfortunately, that's unlikely to be accomplished in a single year: the list of unstable features
102 | used in embedded development is not only long but also includes some of the hardest ones to stabilize:
103 | language items, features for low level control of symbols, features tightly coupled to the backend,
104 | etc.
105 |
106 | Still, that doesn't mean we shouldn't make some progress this year. I think we can attack
107 | stabilization from two fronts: (a) get embedded no-std libraries working on stable, and (b) get a
108 | *minimal* no-std binary working on stable.
109 |
110 | The feature list for (a) is not that long and it probably overlaps with the needs of non-embedded
111 | developers. The list contains:
112 |
113 | - Xargo.
114 | - `const fn`
115 | - `asm!`
116 |
117 | There may be more features but those are the most common.
118 |
119 | The feature list for (b) in short:
120 |
121 | - Xargo
122 | - `panic_fmt`
123 |
124 | That should be enough for applications where the boot sequence and compiler intrinsics are written
125 | in C (e.g. when you link to newlib, a C library for embedded systems). If you want to do everything in
126 | Rust while providing the functionality you would get from newlib then the list becomes much longer:
127 |
128 | - The `compiler_builtins` library
129 | - `#[start]` entry point
130 | - `#[used]`
131 | - `Termination` trait (this wasn't in last year's list ...)
132 | - `#[linkage = "weak"]`
133 |
134 | But I think it makes sense to start with the short version first.
135 |
136 | How can we tackle the most pressing unstable features?
137 |
138 | ### Xargo
139 |
140 | Xargo only works on nightly so if you need it for development you are stuck with nightly. The
141 | general fix is to land [Xargo functionality in Cargo] and then stabilize it. But a more targeted and
142 | faster fix would be to make a `rust-core` component available for some embedded targets,
143 | `thumbv7m-none-eabi` for example.
144 |
145 | The Cargo team has [expressed] their intention of working on the general fix this year so we should
146 | see some progress.
147 |
148 | [Xargo functionality in Cargo]: https://github.com/rust-lang/cargo/issues/4959
149 | [expressed]: https://github.com/japaric/xargo/issues/193#issuecomment-359180429
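
For context, today's nightly-only Xargo workflow looks roughly like this (using the Cortex-M3
target as an example):

``` console
$ # install the source of the standard library crates
$ rustup component add rust-src

$ # Xargo builds libcore for the target before building the crate itself
$ xargo build --target thumbv7m-none-eabi --release
```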
150 |
151 | ### `const fn`
152 |
153 | I know the plan is to swap the current const evaluator with [miri] to make const evaluation more
154 | powerful. Personally, I wouldn't want that improvement to *delay* stabilization of the `const fn`
155 | feature. Even in its current state, where it can only evaluate expressions and calls to other const
156 | fns, `const fn` is already very useful and widely used. I'd like to see the current, limited form
157 | stabilized sometime this year and the miri version behind a feature gate.
158 |
159 | [miri]: https://github.com/solson/miri
160 |
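To illustrate why even the limited form matters: const fn constructors are what let us initialize
`static` variables without `unsafe` code or lazy runtime initialization. A minimal sketch
(`SpscQueue` is a made-up type):

``` rust
#![feature(const_fn)] // this is the gate we'd like to see go away

pub struct SpscQueue {
    head: usize,
    tail: usize,
}

impl SpscQueue {
    // the limited form: only expressions and calls to other const fns
    pub const fn new() -> SpscQueue {
        SpscQueue { head: 0, tail: 0 }
    }
}

// a const fn can be called in a static initializer
static mut QUEUE: SpscQueue = SpscQueue::new();
```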
161 |
162 | ### `asm!`
163 |
164 | I saw someone post an `asm!`-like macro that works on stable by compiling external assembly files
165 | and using FFI to call into them. Unfortunately, that solution is not appropriate for this
166 | application space, for several reasons:
167 |
168 | - These assembly invocations can't be inlined (FFI works at the symbol level) so they will always
169 | have a function call indirection. `no_std` embedded applications are both performance and binary
170 | size sensitive; the indirection would put us behind C / C++ in both aspects.
171 |
172 | - The function call indirection also makes it impossible to have safe wrappers around things like "read
173 |   the Program Counter" or "read the Link Register". It also reduces the effectiveness of
174 |   breakpoint instructions: the debugger ends up in the wrong stack frame.
175 |
176 | - You can't do `global_asm!` because of the FFI call. We use `global_asm!` in the ARM Cortex-M space
177 | to implement weak aliasing since the language doesn't have support for it (C does).
178 |
179 | - This adds a dependency on an external assembler or, worse, a C compiler (the implementation used a
180 | C compiler last time I checked). I would consider that a tooling regression. Today, building ARM
181 | Cortex-M applications only requires an external linker and we use `ld`, not `gcc`. LLD also
182 | works as a linker and as soon as LLD lands in rustc Cortex-M builds won't require *any external
183 | tool*.
184 |
185 | Bottom line: we need proper inline assembly to be stabilized. And, yes, I know it's hard; which is
186 | why I don't have any suggestion here :-).
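
As a concrete example of the kind of safe wrapper meant above, here's a sketch of reading the Link
Register with the unstable `asm!` macro as it exists today (nightly only; the constraint syntax is
the current, pre-RFC one):

``` rust
#![feature(asm)]

/// Returns the value of the Link Register (LR) of the *caller* --
/// only meaningful because the function is always inlined
#[inline(always)]
pub fn lr() -> u32 {
    let r: u32;
    unsafe { asm!("mov $0, lr" : "=r"(r) ::: "volatile") }
    r
}
```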
187 |
188 | ### `panic_fmt`
189 |
190 | I wrote an [RFC] for adding a stable mechanism to specify panicking behavior in `no_std`
191 | applications that would remove the need for the `panic_fmt` lang item. The RFC has been accepted but
192 | it has not been implemented yet. If you are looking for ways to help solve the instability problem
193 | [implementing that RFC][rfc-impl] would be a great contribution!
194 |
195 | [RFC]: https://github.com/rust-lang/rfcs/blob/master/text/2070-panic-implementation.md
196 | [rfc-impl]: https://github.com/rust-lang/rust/issues/44489
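
For the curious, the shape of the stable mechanism that RFC describes is roughly this (shown with
the attribute name it eventually stabilized under, not necessarily the RFC's original spelling):

``` rust
#![no_std]

use core::panic::PanicInfo;

// replaces the `panic_fmt` lang item: a plain function marked with an attribute
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    // e.g. log the panic message and halt, or reset the device
    loop {}
}
```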
197 |
198 | # The `no_std` / `std` gap
199 |
200 | Only a small fraction of the crates.io ecosystem is `no_std` compatible but there are several crates
201 | in the `std`-only category that could become `no_std` compatible:
202 |
203 | - Some `std`-only crates can become `no_std` compatible simply by adding `#![no_std]` to the source
204 | code. Many times this wasn't done from the beginning because the author wasn't aware it was possible
205 | or because `#![no_std]` wasn't a priority for them.
206 |
207 | - Some `std`-only crates only depend on re-exported things that are defined in the `core` and
208 | `collections` crates. These could become `no_std` compatible by adding a `"std"` Cargo feature, a
209 | `#[cfg(not(feature = "std"))] extern crate collections`, and a few other `#[cfg]` attributes here and there (see the sketch after this list).
210 |
211 | - Some `std`-only crates depend on abstractions, like `CStr` and `HashMap`, that are defined in
212 | `std` but that don't depend on OS abstractions like threads, sockets, etc. This situation has led
213 | `no_std` developers to *fork* these `std` abstractions (cf. [`cstr_core`] and [`hashmap_core`])
214 | with the goal of making those crates.io crates `no_std` compatible.
216 |
217 | [`cstr_core`]: https://crates.io/crates/cstr_core
218 | [`hashmap_core`]: https://crates.io/crates/hashmap_core
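
Here's a rough sketch of the feature-gating pattern from the second bullet (the crate layout and
item names are made up; the example uses the `alloc` crate, which on recent nightlies provides what
the old `collections` crate did):

``` rust
// lib.rs of a hypothetical `std`-optional crate
// Cargo.toml would declare:  [features]  default = ["std"]  std = []
#![cfg_attr(not(feature = "std"), no_std)]
#![cfg_attr(not(feature = "std"), feature(alloc))]

#[cfg(not(feature = "std"))]
extern crate alloc;

#[cfg(feature = "std")]
use std::vec::Vec;
#[cfg(not(feature = "std"))]
use alloc::vec::Vec;

// the rest of the crate is written against the common subset
pub fn collect_bytes(n: u8) -> Vec<u8> {
    (0..n).collect()
}
```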
219 |
220 | Making a crate `no_std` compatible needs to become simpler to avoid the scenario where people prefer
221 | to create a *new* `no_std` compatible crate instead of making the ones already published `no_std`
222 | compatible.
223 |
224 | I don't have good suggestions here. Perhaps the first scenario could be improved with some `rustc` /
225 | clippy lint that points out that the crate can be marked as `no_std` compatible. The second and
226 | third scenarios *might* be addressed by the portable lint stuff, but I'm not familiar with that
227 | feature.
228 |
229 | UPDATE(2018-01-22) I think [this comment] by /u/Zoxc32 would be a great solution to the last two
230 | scenarios.
231 |
232 | [this comment]: https://www.reddit.com/r/rust/comments/7s0m6f/eir_embedded_rust_in_2018/dt1f5r2/
233 |
234 | # Better IDE support
235 |
236 | Another thing that C embedded developers are used to working with is IDEs with integrated embedded
237 | tooling: register views, tracing and profiling. Of course, I'm not going to ask the Rust team to
238 | implement embedded tooling but improvements to the RLS improve the IDE experience for everyone so
239 | those are very welcome.
240 |
241 | ## Code completion
242 |
243 | I'm personally really looking forward to *awesome* code completion support in the RLS. Recently I've
244 | been writing some crates using [svd2rust] generated APIs and I'm afraid to admit that I had to
245 | *disable* auto completion because it was slowing down my coding with delays of around one second and
246 | because it didn't provide assistance where I needed it (it didn't suggest methods). `svd2rust`
247 | generated crates are huge though; they usually contain thousands of structs, each one with a handful
248 | of methods. I hope RLS powered code completion will be able to handle them!
249 |
250 | [svd2rust]: https://crates.io/crates/svd2rust
251 |
252 | # Language features
253 |
254 | In embedded programs we tend to use a bunch of `static` variables. There are still some limitations
255 | around `static` variables but some planned features would solve them. I'm personally looking forward
256 | to these features:
257 |
258 | ## `impl Trait` everywhere
259 |
260 | As I mentioned in my [previous] blog post we want to write generic async drivers but to do that we
261 | need traits whose methods return generators and that doesn't work right now so we are blocked on
262 | that front.
263 |
264 | [previous]: /brave-new-io
265 |
266 | ``` rust
267 | trait Write {
268 |     fn write_all<B>(
269 | self,
270 | buffer: B,
271 | ) -> impl Generator where ..;
272 | // `-> Box<..>` would work but don't want to depend on a memory allocator
273 | }
274 | ```
275 |
276 | There's also a use case for storing generators in `static` variables. That could potentially let us
277 | write reactive code (code that gets dispatched in interrupt handlers) in a more natural way
278 | ("straight line" code). Today, that reactive style requires hand writing state machines.
279 |
280 | ``` rust
281 | // Some DSL (macro) could expand to something like this
282 |
283 | static mut GN: Option<SomeGenerator> = None; // `SomeGenerator`: placeholder for the generator's concrete type
284 |
285 | fn interrupt_handler() {
286 | // do some magic with `GN`
287 | }
288 | ```
289 |
290 | ## Const generics
291 |
292 | Often we need collections like [`Vec`]s and [queues] with fixed, known at compile time, capacities.
293 | Those collections internally use arrays as buffers and need to have their capacity (the array size)
294 | parametrized in their types. The problem is that the capacity is a number not a type.
295 |
296 | [`Vec`]: https://docs.rs/heapless/0.2.1/heapless/struct.Vec.html
297 | [queues]: https://docs.rs/heapless/0.2.1/heapless/ring_buffer/struct.RingBuffer.html
298 |
299 | I tried using `AsRef` and `AsMut` as bounds but they didn't cut it because they are limited to
300 | arrays of 32 elements.
301 |
302 | ```
303 | fn example<T>(xs: &T) where
304 | T: AsRef<[u8]>,
305 | {
306 | // ..
307 | }
308 |
309 | let xs = [0; 33];
310 | example(&xs);
311 | //~^ error: `AsRef<[u8]>` not implemented for `[u8; 33]`
312 | ```
313 |
314 | So I'm currently using the `Unsize` trait and it works for arrays of any size but it's a hack (not
315 | its intended usage) and it makes type signatures weird.
316 |
317 | ``` rust
318 | struct Vec<T, B>
319 | where
320 |     B: Unsize<[T]>,
321 | {
322 |     buffer: B, // B is effectively `[T; N]`
323 |     /* .. */
324 | }
325 |
326 | impl<T, B> Vec<T, B>
327 | where
328 |     B: Unsize<[T]>,
329 | {
330 |     fn pop(&mut self) -> Option<T> {
331 |         // unsize the buffer into a slice
332 |         let slice: &mut [T] = &mut self.buffer;
333 |         // ..
334 |     }
335 | }
336 |
337 | fn example(xs: &mut Vec<u8, [u8; 64]>) { .. }
338 | // odd                      ^^^^^^^^
339 | ```
340 |
341 | With const generics we would be able to *directly* parametrize the capacity in the `Vec` type:
342 |
343 | ``` rust
344 | struct Vec<T, const N: usize> {
345 |     buffer: [T; N],
346 |     /* .. */
347 | }
348 |
349 | fn example(xs: &mut Vec<u8, 64>) { .. }
350 | // better                   ^^
351 | ```
352 |
353 | This one's not a blocker but would be nice to have. We only need the most basic version of const
354 | generics, which has already been accepted, so I'm hoping it gets implemented sooner rather than later.
355 |
356 | ---
357 |
358 | That's my wishlist for the Rust team. Let's make 2018 a great year for embedded Rust!
359 |
360 | Let's discuss on [reddit].
361 |
362 | [reddit]: https://www.reddit.com/r/rust/comments/7s0m6f/eir_embedded_rust_in_2018/
363 |
--------------------------------------------------------------------------------
/content/hello-world.md:
--------------------------------------------------------------------------------
1 | +++
2 | author = "Jorge Aparicio"
3 | date = "2017-04-24T08:00:08-05:00"
4 | description = "Introduction post"
5 | draft = false
6 | title = "Hello, world!"
7 | +++
8 |
9 | Hey there! Welcome to my blog, where I'll be writing about Rust and embedded
10 | systems-y stuff -- that is, mainly about programming ARM Cortex-M
11 | microcontrollers, as that's what Rust supports best today [^targets]. But I'm
12 | interested in anything that has a `#![no_std]` attribute in it [^no_std] so I
13 | may cover some other stuff as well.
14 |
15 | [^targets]: There's in tree support for MSP430 microcontrollers but I own no
16 | MSP430 hardware; and, AVR support is not in tree yet.
17 |
18 | [^no_std]: That includes building your own `std`! So I may write about [Xargo]
19 | and [steed] at some point.
20 |
21 | [Xargo]: https://github.com/japaric/xargo
22 | [steed]: https://github.com/japaric/steed
23 |
24 | That being said, this first post is neither about Rust nor embedded stuff, as it's
25 | mainly for testing my blogging setup; so, why not write about that instead?
26 | (Otherwise this post will end up being too short)
27 |
28 | # My blogging setup
29 |
30 | This blog is a static website built using [Hugo], a fast static site generator
31 | written in Go. The blog theme is a modified version of the [hucore] theme. The
32 | modifications are the following:
33 |
34 | [Hugo]: https://gohugo.io
35 | [hucore]: https://themes.gohugo.io/hucore/
36 |
37 | - Customizable highlight.js theme.
38 |
39 | I didn't like the default theme and there didn't seem to be any way to change it
40 | so I hacked the theme source code to make the theme customizable. I've picked
41 | the `tomorrow-night` theme for this blog since that's what I use in my terminal.
42 | It looks like this:
43 |
44 | ``` rust
45 | fn main() {
46 | println!("Hello, world!");
47 | }
48 | ```
49 |
50 | - LiveReload support.
51 |
52 | But you won't notice this one as it's a development only feature.
53 |
54 | - Table of Contents
55 |
56 | Which you should see on the right. I stole this one from the [Minos] theme. I
57 | ... hope they don't mind.
58 |
59 | [Minos]: https://themes.gohugo.io/hugo-theme-minos
60 |
61 | As a good citizen of the open source world, I sent PRs to [mgjohansen/hucore]
62 | with some of these modifications.
63 |
64 | [mgjohansen/hucore]: https://github.com/mgjohansen/hucore
65 |
66 | Leaving the theme aside, the "source" of this blog, a bunch of Markdown files,
67 | is hosted on [GitHub]. The site is hosted on [GitHub pages] and I'm [using
68 | Travis] to update the site every time I push to the source repo.
69 |
70 | [GitHub]: https://github.com/japaric/embedded-in-rust
71 | [GitHub pages]: https://pages.github.com/
72 | [using Travis]: https://github.com/japaric/embedded-in-rust/blob/master/.travis.yml
73 |
74 | I think that's enough for this post. See you in the next one! (Oh, you have no
75 | idea what I have in store :smile:)
76 |
--------------------------------------------------------------------------------
/content/rtfm-v4.md:
--------------------------------------------------------------------------------
1 | +++
2 | author = "Jorge Aparicio"
3 | date = 2018-12-19T18:40:45+01:00
4 | draft = false
5 | tags = ["ARM Cortex-M", "concurrency", "RTFM"]
6 | title = "RTFM v0.4: +stable, software tasks, message passing and a timer queue"
7 | +++
8 |
9 | Hey there! It's been a long time since my last post.
10 |
11 | Today I'm pleased to announce [v0.4.0] of the Real Time for The Masses framework
12 | (AKA RTFM), a concurrency framework for building real time applications.
13 |
14 | [v0.4.0]: https://docs.rs/cortex-m-rtfm/0.4.0/rtfm/
15 |
16 | The greatest new feature, IMO, is that RTFM now works on stable Rust (`1.31+`)!
17 | :tada: :tada: :tada:
18 |
19 | This release also packs quite a few new features which I'll briefly cover in
20 | this post. For a more thorough explanation of RTFM's task model and its
21 | capabilities check out [the RTFM book], which includes examples you can run on
22 | your laptop (yay for emulation), and the [API documentation].
23 |
24 | [the RTFM book]: https://japaric.github.io/cortex-m-rtfm/book/
25 | [API documentation]: https://japaric.github.io/cortex-m-rtfm/api/rtfm/index.html
26 |
27 | # New syntax
28 |
29 | In previous releases you specified tasks and resources using a bang macro:
30 | [`app!`]. This macro has been replaced by a bunch of attributes: `#[app]`,
31 | `#[interrupt]`, `#[exception]`, etc.
32 |
33 | [`app!`]: https://github.com/japaric/cortex-m-rtfm/blob/v0.3.4/examples/preemption.rs#L11-L32
34 |
35 | To give you an idea of the new syntax here's one example from the book:
36 |
37 | ``` rust
38 | // examples/interrupt.rs
39 |
40 | #![deny(unsafe_code)]
41 | #![deny(warnings)]
42 | #![no_main]
43 | #![no_std]
44 |
45 | extern crate panic_semihosting;
46 |
47 | use cortex_m_semihosting::{debug, hprintln};
48 | use lm3s6965::Interrupt;
49 | use rtfm::app;
50 |
51 | #[app(device = lm3s6965)]
52 | const APP: () = {
53 | #[init]
54 | fn init() {
55 | // Pends the UART0 interrupt but its handler won't run until *after*
56 | // `init` returns because interrupts are disabled
57 | rtfm::pend(Interrupt::UART0);
58 |
59 | hprintln!("init").unwrap();
60 | }
61 |
62 | #[idle]
63 | fn idle() -> ! {
64 | // interrupts are enabled again; the `UART0` handler runs at this point
65 |
66 | hprintln!("idle").unwrap();
67 |
68 | rtfm::pend(Interrupt::UART0);
69 |
70 | // exit the emulator
71 | debug::exit(debug::EXIT_SUCCESS);
72 |
73 | loop {}
74 | }
75 |
76 | // interrupt handler = hardware task
77 | #[interrupt]
78 | fn UART0() {
79 | static mut TIMES: u32 = 0;
80 |
81 | // Safe access to local `static mut` variable
82 | *TIMES += 1;
83 |
84 | hprintln!(
85 | "UART0 called {} time{}",
86 | *TIMES,
87 | if *TIMES > 1 { "s" } else { "" }
88 | )
89 | .unwrap();
90 | }
91 | };
92 | ```
93 |
94 | ``` console
95 | $ qemu-system-arm (..) interrupt
96 | init
97 | UART0 called 1 time
98 | idle
99 | UART0 called 2 times
100 | ```
101 |
102 | (The `const APP` that's used like a module must look a bit weird to you. I'll
103 | get to it in a minute.)
104 |
105 | The main motivation for this change is to allow composition with other
106 | attributes like the built-in `#[cfg]` attribute, which is used for conditional
107 | compilation, and a hypothetical [`#[ramfunc]`][ramfunc] attribute, which places
108 | functions in RAM.
109 |
110 | [ramfunc]: https://github.com/rust-embedded/cortex-m-rt/pull/100
111 |
112 | ``` rust
113 | // NOTE: assuming some future release of cortex-m-rt
114 | use cortex_m_rt::ramfunc;
115 |
116 | #[rtfm::app(device = lm3s6965)]
117 | const APP: () = {
118 | // ..
119 |
120 | // gotta go fast: run this exception handler (task) from RAM!
121 | #[exception]
122 | #[ramfunc]
123 | fn SysTick() {
124 | // ..
125 | }
126 |
127 | #[cfg(feature = "heartbeat")]
128 | #[interrupt(resources = [LED])]
129 | fn TIMER_0A() {
130 | resources.LED.toggle();
131 | }
132 |
133 | // ..
134 | };
135 | ```
136 |
137 | The other motivation is to let you decentralize the declaration of tasks and
138 | resources. With the old `app!` macro everything had to be declared upfront in a
139 | single place; with attributes you'll be able to declare tasks and resources in
140 | different modules.
141 |
142 | ``` rust
143 | // NOTE: this is NOT a valid rtfm v0.4 application!
144 | #![rtfm::app]
145 |
146 | mod resources {
147 | #[resource]
148 | static mut FOO: u32 = 0;
149 | }
150 |
151 | mod tasks {
152 | #[resource]
153 | static mut BAR: u32 = 0;
154 |
155 | #[interrupt(resources = [crate::resources::FOO, BAR])]
156 | fn UART0() {
157 | // ..
158 | }
159 | }
160 | ```
161 |
162 | However, that's more of a long term goal as it's currently not possible to use
163 | crate level procedural macros or attributes on modules if you are using the
164 | stable channel. The lack of those features on stable is why we are using a
165 | `const` item as a module.
166 |
167 | Finally, it's nice that RTFM applications don't contain any special syntax
168 | (compared to v0.3's `app!`) so now `rustfmt` is able to format the whole
169 | crate.
170 |
171 | # Software tasks
172 |
173 | Until RTFM v0.3, tasks could only be started by an *event* like the user
174 | pressing a button, receiving new data or a software event (see [`rtfm::pend`]).
175 | Also, each of those *hardware* tasks maps to a different interrupt handler
176 | so you can run out of interrupt handlers if you have many tasks.
177 |
178 | [`rtfm::pend`]: https://japaric.github.io/cortex-m-rtfm/api/rtfm/fn.pend.html
179 |
180 | RTFM v0.4 introduces *software* tasks, tasks that can be *spawned* on-demand
181 | from any context. The runtime will dispatch all the software tasks that run at
182 | the same priority from the same interrupt handler so you won't run out of
183 | interrupt handlers even if you have dozens of tasks.
184 |
185 | Software tasks come in handy when you want to keep a hardware task responsive
186 | to events: you can defer the non-time-critical work to a software task that runs
187 | at a lower priority.
188 |
189 | ``` rust
190 | // heapless = "0.4.1"
191 | use heapless::{consts::U16, Vec};
192 |
193 | #[rtfm::app(device = lm3s6965)]
194 | const APP: () = {
195 | // ..
196 |
197 | // high priority hardware task
198 | // started when a new byte of data is received
199 | // needs to finish relatively quickly or incoming data will be lost
200 | #[interrupt(priority = 2, spawn = [some_command])]
201 | fn UART0() {
202 | // Fixed capacity vector with inline storage (no heap memory is used)
203 |         static mut BUFFER: Vec<u8, U16> = Vec::new();
204 |
205 | let byte = read_byte_from_serial_port();
206 |
207 | if byte == b'\n' {
208 | match &BUFFER[..] {
209 | b"some-command" => spawn.some_command().unwrap(),
210 | // .. handle other cases ..
211 | }
212 |
213 | BUFFER.clear();
214 | } else {
215 | if BUFFER.push(byte).is_err() {
216 | // .. handle error (buffer is full) ..
217 | }
218 | }
219 | }
220 |
221 | // lower priority software task
222 | // only runs when `UART0` is not running
223 | // this task can be preempted by `UART0`, which has higher priority
224 | #[task(priority = 1)]
225 | fn some_command() {
226 | // .. do non time critical stuff that takes a while to execute ..
227 | }
228 |
229 | // ..
230 | };
231 | ```
232 |
233 | # Message passing
234 |
235 | When you spawn a task you can also send a message which will become the input of
236 | the task. Message passing can remove the need for explicit memory sharing and
237 | locks (see [`rtfm::Mutex`]) .
238 |
239 | [`rtfm::Mutex`]: https://japaric.github.io/cortex-m-rtfm/api/rtfm/trait.Mutex.html
240 |
241 | Using message passing we can change the previous example to handle all commands
242 | from a single task.
243 |
244 | ``` rust
245 | pub enum Command {
246 | Foo,
247 | Bar(u8),
248 | // ..
249 | }
250 |
251 | #[rtfm::app(device = lm3s6965)]
252 | const APP: () = {
253 | // ..
254 |
255 | #[interrupt(priority = 2, spawn = [run_command])]
256 | fn UART0() {
257 |         static mut BUFFER: Vec<u8, U16> = Vec::new();
258 |
259 | let byte = read_byte_from_serial_port();
260 |
261 | if byte == b'\n' {
262 | match &BUFFER[..] {
263 | // NOTE: this changed!
264 | b"foo" => spawn.run_command(Command::Foo).ok().unwrap(),
265 | // .. handle other cases ..
266 | }
267 |
268 | BUFFER.clear();
269 | } else {
270 | if BUFFER.push(byte).is_err() {
271 | // .. handle error (buffer is full) ..
272 | }
273 | }
274 | }
275 |
276 | // NOTE: NEW!
277 | // (the default priority for tasks is 1 so we can actually omit it here)
278 | #[task]
279 | fn run_command(command: Command) {
280 | match command {
281 | Command::Foo => { /* .. */ }
282 | // ..
283 | }
284 | }
285 |
286 | // ..
287 | };
288 | ```
289 |
290 | Furthermore, unlike hardware tasks, software tasks are buffered so you can spawn
291 | several instances of them: all the posted messages will be queued and executed
292 | in FIFO order.
293 |
294 | All the internal buffers used by the RTFM runtime are statically allocated so
295 | RTFM doesn't depend on a dynamic memory allocator. Instead, you specify the
296 | capacity of the message queue in the `#[task]` attribute -- the capacity
297 | defaults to 1 if not explicitly stated.
298 |
299 | If in our running example we expect that some command will take long enough to
300 | execute that another command may arrive in the meantime, then we can increase
301 | the capacity of the message queue.
302 |
303 | ``` rust
304 | #[rtfm::app(device = lm3s6965)]
305 | const APP: () = {
306 | // ..
307 |
308 | // now we can receive up to 2 more commands while this runs
309 | #[task(capacity = 2)]
310 | fn run_command(command: Command) {
311 | match command {
312 | Command::Foo => { /* .. */ }
313 | // ..
314 | }
315 | }
316 |
317 | // ..
318 | };
319 | ```
320 |
321 | # Timer queue
322 |
323 | The RTFM framework provides an opt-in `timer-queue` feature (NOTE: ARMv7-M only
324 | feature, for now). When enabled a global timer queue is added to the RTFM
325 | runtime. This timer queue can be used to `schedule` tasks to run at some time in
326 | the future.
327 |
328 | One of the main use cases of the `schedule` API (also see [`rtfm::Instant`] and
329 | [`rtfm::Duration`]) is creating periodic tasks.
330 |
331 | [`rtfm::Instant`]: https://japaric.github.io/cortex-m-rtfm/api/rtfm/struct.Instant.html
332 | [`rtfm::Duration`]: https://japaric.github.io/cortex-m-rtfm/api/rtfm/struct.Duration.html
333 |
334 | ``` rust
335 | const PERIOD: u32 = 12_000_000; // clock cycles == one second
336 |
337 | #[rtfm::app(device = lm3s6965)]
338 | const APP: () = {
339 | #[init(spawn = [periodic])]
340 | fn init() {
341 | // bootstrap the `periodic` task
342 | spawn.periodic().unwrap();
343 | }
344 |
345 | #[task(schedule = [periodic])]
346 | fn periodic() {
347 | // .. do stuff ..
348 |
349 | // schedule this task to run at `PERIOD` clock cycles after
350 | // it was last `scheduled` to run
351 | schedule.periodic(scheduled + PERIOD.cycles()).unwrap();
352 | }
353 |
354 | // ..
355 | };
356 | ```
357 |
358 | # What's next?
359 |
360 | To compile on stable some sacrifices had to be made in terms of (static) memory
361 | usage and code size. As there's no way to have uninitialized memory in `static`
362 | variables I had to rely on `Option`s and late (runtime) initialization in
363 | several places. But once `MaybeUninit` and `const fn` with trait bounds make
364 | their way into stable I'll be able to remove all that unnecessary overhead.
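
To make that overhead concrete, here's a rough sketch of the difference (the queue type is a
made-up stand-in for the runtime's internal buffers):

``` rust
use core::mem::MaybeUninit;

// a stand-in for one of the runtime's internal queues
struct TaskQueue {
    len: usize,
}

// today, on stable: late (runtime) initialization behind an `Option`;
// every access pays for the discriminant (space) and the `Some` check (time)
static mut QUEUE: Option<TaskQueue> = None;

// eventually, with `MaybeUninit` and `const fn` with trait bounds on stable:
// no discriminant and no runtime check; initialized exactly once before use
static mut QUEUE_NEXT: MaybeUninit<TaskQueue> = MaybeUninit::uninit();
```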
365 |
366 | More importantly though, I've been [playing] with Cortex-R processors, multicore
367 | devices and asymmetric multiprocessing (AKA AMP)! And I'm happy to report that
368 | not only have I got RTFM running on a Cortex-**R** core but I also have
369 | implemented a proof of concept for *multicore* RTFM!
370 |
371 | [playing]: https://mobile.twitter.com/japaric_io/status/1071116410166935553
372 |
373 | This is what my current multicore RTFM prototype looks like:
374 |
375 | ``` rust
376 | #![no_main]
377 | #![no_std]
378 |
379 | extern crate panic_dcc;
380 |
381 | use dcc::dprintln;
382 |
383 | const LIMIT: u32 = 5;
384 |
385 | #[rtfm::app(cores = 2)] // <- TWO cores!
386 | const APP: () = {
387 | #[cfg(core = "0")]
388 | #[init]
389 | fn init() {
390 | // nothing to do here
391 | }
392 |
393 | // this task runs on the first core
394 | #[cfg(core = "0")]
395 | #[task(spawn = [pong])]
396 | fn ping(x: u32) {
397 | dprintln!("ping({})", x);
398 |
399 | if x < LIMIT {
400 |             // here we send a message to the other core!
401 | spawn.pong(x + 1).unwrap();
402 | }
403 | }
404 |
405 | #[cfg(core = "1")]
406 | #[init(tasks = [pong])]
407 | fn init() {
408 | // spawn the local `pong` task
409 | spawn.pong(0).unwrap();
410 | }
411 |
412 | // this task runs on the second core
413 | #[cfg(core = "1")]
414 | #[task(spawn = [ping])]
415 | fn pong(x: u32) {
416 | dprintln!("pong({})", x);
417 |
418 | if x < LIMIT {
419 | // another cross-core message!
420 | spawn.ping(x + 1).unwrap();
421 | }
422 | }
423 | };
424 | ```
425 |
426 | ``` console
427 | $ # logs from the first core
428 | $ tail -f dcc0.log
429 | IRQ(ICCIAR { cpuid: 1, ackintid: 65 })
430 | ping(1)
431 | ~IRQ(ICCIAR { cpuid: 1, ackintid: 65 })
432 | IRQ(ICCIAR { cpuid: 1, ackintid: 65 })
433 | ping(3)
434 | ~IRQ(ICCIAR { cpuid: 1, ackintid: 65 })
435 | IRQ(ICCIAR { cpuid: 1, ackintid: 65 })
436 | ping(5)
437 | ~IRQ(ICCIAR { cpuid: 1, ackintid: 65 })
438 | ```
439 |
440 | ``` console
441 | $ # logs from the second core
442 | $ tail -f dcc1.log
443 | IRQ(ICCIAR { cpuid: 1, ackintid: 66 })
444 | pong(0)
445 | ~IRQ(ICCIAR { cpuid: 1, ackintid: 66 })
446 | IRQ(ICCIAR { cpuid: 0, ackintid: 66 })
447 | pong(2)
448 | ~IRQ(ICCIAR { cpuid: 0, ackintid: 66 })
449 | IRQ(ICCIAR { cpuid: 0, ackintid: 66 })
450 | pong(4)
451 | ~IRQ(ICCIAR { cpuid: 0, ackintid: 66 })
452 | ```
453 |
454 | In this PoC, you write multicore applications in a single crate and you use
455 | `#[cfg(core = "*")]` to assign tasks and resources to one core or the other.
456 | Also, you can send messages across cores in a lock-free, wait-free, alloc-free
457 | manner.
458 |
459 | I have tested this PoC on a dual core Cortex-R5 device but I'm certain that the
460 | approach can be adapted to heterogeneous devices (e.g. Cortex-M4 + Cortex-M0+)
461 | which are more common in the microcontroller space.
462 |
463 | This sounds nice and all but, unfortunately, this PoC is not *completely* memory
464 | safe and thus not ready for show time. It has a few memory safety holes around
465 | its uses of `Send` and `Sync` that I'm not sure how best to solve.
466 |
467 | To give you an example of the issues I'm thinking about: something that's
468 | *single-core* `Sync`, like [`bare_metal::Mutex`], is not necessarily
469 | *multi-core* `Sync` (e.g. [`spin::Mutex`] is multi-core `Sync`) but there's only
470 | one widely used `Sync` trait, which most people understand as multi-core `Sync`.
471 | I can create my own `SingleCoreSync` but will the community adopt it? More
472 | importantly, if we change `bare_metal::Mutex` to only implement `SingleCoreSync`
473 | (and make it sound to use in the multicore RTFM model) then you won't be able to
474 | use it in `static` variables (those require a `Sync` bound) which is a valid use
475 | case today.
476 |
477 | [`bare_metal::Mutex`]: https://docs.rs/bare-metal/0.2.4/bare_metal/struct.Mutex.html
478 | [`spin::Mutex`]: https://docs.rs/spin/0.4.10/spin/struct.Mutex.html
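
A `SingleCoreSync` trait would look something like this minimal sketch (the name and the blanket
impl are made up for illustration; this is not an existing API):

``` rust
/// Marker trait (hypothetical): types that can be safely shared between
/// execution contexts running on the *same* core (e.g. tasks and interrupt
/// handlers), but not necessarily between different cores
pub unsafe trait SingleCoreSync {}

/// Anything that is fully (multi-core) `Sync` is also single-core `Sync`
unsafe impl<T: Sync + ?Sized> SingleCoreSync for T {}
```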
479 |
480 | Another example: a `&'static mut T` (or a `Box<T>`) is a safe thing to `Send`
481 | from one task to another *within a core* but across cores safety depends on
482 | where the reference points to. If it points to memory shared between the cores
483 | then all's good, but if it points to memory that's only visible to one of the
484 | cores (e.g. Tightly Coupled Memory) then the operation is UB. The problem is
485 | that you can't tell where the reference points to by just looking at the type
486 | because the location is specified using an attribute (`#[link_section]`).
487 |
488 | I plan to do a more detailed blog post about `no_std` AMP in Rust. Hopefully,
489 | the Rust community will give me some good ideas about how to deal with these
490 | problems!
491 |
492 | Until next time!
493 |
494 | ---
495 |
496 | __Thank you patrons! :heart:__
497 |
498 | I want to wholeheartedly thank:
499 |
500 |
507 |
508 | [Iban Eguia],
509 | [Geoff Cant],
510 | [Harrison Chin],
511 | [Brandon Edens],
512 | [whitequark],
513 | [James Munns],
514 | [Fredrik Lundström],
515 | [Kjetil Kjeka],
516 | [Kor Nielsen],
517 | [Alexander Payne],
518 | [Dietrich Ayala],
519 | [Hadrien Grasland],
520 | [vitiral],
521 | [Lee Smith],
522 | [Florian Uekermann],
523 | [Adam Green]
524 | and 57 more people for [supporting my work on Patreon][Patreon].
525 |
526 | [Iban Eguia]: https://github.com/Razican
527 | [Geoff Cant]: https://github.com/archaelus
528 | [Harrison Chin]: http://www.harrisonchin.com/
529 | [Brandon Edens]: https://github.com/brandonedens
530 | [whitequark]: https://github.com/whitequark
531 | [James Munns]: https://jamesmunns.com/
532 | [Fredrik Lundström]: https://github.com/flundstrom2
533 | [Kjetil Kjeka]: https://github.com/kjetilkjeka
534 | [Kor Nielsen]: https://github.com/korran
535 | [Alexander Payne]: https://myrrlyn.net/
536 | [Dietrich Ayala]: https://metafluff.com/
537 | [Hadrien Grasland]: https://github.com/HadrienG2
538 | [vitiral]: https://github.com/vitiral
539 | [Lee Smith]: https://github.com/leenozara
540 | [Florian Uekermann]: https://github.com/FlorianUekermann
541 | [Adam Green]: https://github.com/adamgreen
542 |
543 | ---
544 |
545 | Let's discuss on [reddit].
546 |
547 | [reddit]: https://www.reddit.com/r/rust/comments/a7opk8/eir_real_time_for_the_masses_v04_stable_software/
548 |
549 | Enjoyed this post? Like my work on embedded stuff? Consider supporting my work
550 | on [Patreon]!
551 |
552 | [Patreon]: https://www.patreon.com/japaric
553 |
554 | Follow me on [twitter] for even more embedded stuff.
555 |
556 | [twitter]: https://twitter.com/japaric_io
557 |
558 | The embedded Rust community gathers on the #rust-embedded IRC channel
559 | (irc.mozilla.org). Join us!
560 |
--------------------------------------------------------------------------------
/content/stack-overflow-protection.md:
--------------------------------------------------------------------------------
1 | +++
2 | author = "Jorge Aparicio"
3 | date = 2018-02-17T18:16:39+01:00
4 | draft = false
5 | tags = ["ARM Cortex-M", "safety"]
6 | title = "Zero cost stack overflow protection for ARM Cortex-M devices"
7 | +++
8 |
9 | One of the core features of Rust is memory safety. Whenever possible the compiler enforces memory
10 | safety at compile time. One example of this is the borrow checker, which prevents data races, iterator
11 | invalidation, pointer invalidation and other issues at compile time. Other memory problems like
12 | buffer overflows can't be prevented at compile time. In those cases the compiler inserts runtime
13 | checks, bounds checks in this case, to enforce memory safety at runtime.
14 |
15 | What about stack overflows? For quite a long time Rust didn't have stack overflow checking but that
16 | wasn't much of a problem on tier 1 platforms since these platforms have an OS and an MMU (Memory
17 | Management Unit) that prevent stack overflows from wreaking havoc.
18 |
19 | Consider this (silly) program that calls a recursive function that allocates a 1 MB array on the
20 | stack.
21 |
22 | ``` rust
23 | fn main() {
24 | println!("{}", fib(10));
25 | }
26 |
27 | #[inline(never)]
28 | fn fib(n: u64) -> u64 {
29 | let _use_stack = [0u8; 1024 * 1024];
30 |
31 | if n < 2 {
32 | 1
33 | } else {
34 | fib(n - 1) + fib(n - 2)
35 | }
36 | }
37 | ```
38 |
39 | If you run this safe program using last year's nightly you get a segmentation fault.
40 |
41 | ``` console
42 | $ # last year nightly
43 | $ cargo run +nightly-2017-02-16
44 | [1] 15156 segmentation fault (core dumped) cargo run +nightly-2017-02-16
45 | ```
46 |
47 | But if you run it with a recent nightly you'll get an abort and a meaningful error message.
48 |
49 | ``` console
50 | $ cargo run +nightly-2018-02-16
51 | thread 'main' has overflowed its stack
52 | fatal runtime error: stack overflow
53 | [1] 16042 abort (core dumped) cargo run +nightly-2018-02-16
54 | ```
55 |
56 | The difference in behavior is due to *stack probe* support landing in rustc / LLVM last year. Like
57 | bounds checks, stack probes are also a runtime memory safety mechanism but for catching stack
58 | overflows. At the time of writing only x86 / x86_64 has stack probe support in rustc / LLVM.
59 |
60 | # MMU-less devices
61 |
62 | But what's the effect of a stack overflow on bare metal devices that have no OS or MMU, like the
63 | ARM Cortex-M?
64 |
65 | Let's find out with this (silly) program:
66 |
67 | ``` rust
68 | #![no_std]
69 |
70 | extern crate cortex_m;
71 | extern crate stm32f103xx;
72 |
73 | use cortex_m::asm;
74 |
75 | const PATTERN: u32 = 0xdeadbeef;
76 |
77 | // initialize some RAM to a known bit pattern
78 | static mut DATA: [u32; 1024] = [PATTERN; 1024];
79 |
80 | fn main() {
81 | asm::bkpt();
82 |
83 | let _x = fib(100);
84 | }
85 |
86 | #[inline(never)]
87 | fn fib(n: u32) -> u32 {
88 | if unsafe { DATA.last() } != Some(&PATTERN) {
89 | // `DATA` never changes so this should be unreachable, right?
90 | asm::bkpt();
91 | }
92 |
93 |     // allocate and zero 1 KB of stack memory
94 | let _use_stack = [0u8; 1024];
95 |
96 | if n < 2 {
97 | 1
98 | } else {
99 | fib(n - 1) + fib(n - 2)
100 | }
101 | }
102 | ```
103 |
104 | You can probably guess how this will go ... If you debug this program and inspect the memory where
105 | `DATA` is located at the first breakpoint, before `fib` is called, you'll see something like this:
106 |
107 | ``` console
108 | > # GDB
109 | > continue
110 | overflow::main () at src/main.rs:14
111 | 14 asm::bkpt();
112 |
113 | > # breakpoint in `main`
114 |
115 | > x/1028x 0x20000000 # inspect the DATA variable
116 | 0x20000000: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef # start of DATA
117 | (..)
118 | 0x20000ff0: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef # end of DATA
119 | 0x20001000: 0xc260b0e9 0xda79849d 0x517bb7fa 0xa84886ba # uninitialized RAM
120 | ```
121 |
122 | That matches the expected bit pattern. So far so good.
123 |
124 | If you resume the program until it hits the second breakpoint, the one inside the `fib` function,
125 | you'll see this:
126 |
127 | ``` console
128 | > continue
129 | overflow::fib (n=86) at src/main.rs:22
130 | 22 asm::bkpt();
131 |
132 | > # breakpoint in `fib`
133 |
134 | > x/1028x 0x20000000
135 | 0x20000000: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef # start of DATA
136 | (..)
137 | 0x20000fb0: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
138 | 0x20000fc0: 0x20000ffc 0x08001070 0x20000ffc 0x08001070
139 | 0x20000fd0: 0xdeadbeef 0x00000001 0x2000107c 0x08001074
140 | 0x20000fe0: 0x2000107c 0x08001074 0x20001048 0x0800036b
141 | 0x20000ff0: 0x20000000 0x00000001 0x00000000 0x00000001 # end of DATA
142 | 0x20001000: 0x00000000 0x00000001 0x2000107c 0x08001074
143 | ```
144 |
145 | The `DATA` variable has been silently corrupted! Although this program has some `unsafe` code the
146 | memory corruption is not caused by the `unsafe` code; it is caused by calling the `fib` function,
147 | which is safe to call.
148 |
149 | This means that ARM Cortex-M programs which only contain safe code can run into memory corruption
150 | issues, and that goes against Rust's core feature of memory safety. Let's fix it!
151 |
152 | # Fixing it
153 |
154 | Stack probes seem like the right way to fix this, but unfortunately stack probe support is only
155 | available on x86 and here we are talking about the ARM Cortex-M architecture. There's another
156 | problem as well: the [x86 implementation] of stack probes assumes there's some paging (virtual
157 | memory) mechanism being used so that implementation can't be directly translated to bare metal ARM.
158 | Finally, stack probes impose a runtime overhead on function calls so it's not a zero cost solution.
159 |
160 | [x86 implementation]: https://github.com/rust-lang-nursery/compiler-builtins/blob/0ba07e49264a54cb5bbd4856fcea083bb3fbec15/src/probestack.rs#L50
161 |
162 | Thankfully, there's another way to fix this and that's truly zero cost. Before I explain it let me
163 | first show you how stack overflows cause memory corruption.
164 |
165 | This is the memory layout of a bare metal Cortex-M program like the one I showed before.
166 |
167 |
168 |
169 |
170 |
171 | Static variables, like the `DATA` variable from the previous program, are stored at the bottom
172 | (start) of RAM, in the `.bss` and `.data` sections, which are fixed in size. The stack is located
173 | at the top (end) of RAM and it grows downwards. If the stack grows too large it can crash into
174 | the `.bss+.data` section, overwriting it; this corrupts `static` variables.
175 |
176 | The way to prevent stack overflows from corrupting memory is simple: you place the `.bss+.data`
177 | section at the *top* of RAM and put the stack below it. Like this:
178 |
179 |
180 |
181 |
182 |
183 | In this scenario when the stack grows too large it ends up crashing into the boundary of the RAM
184 | region and that triggers a *hard fault* exception. With this layout the `static` variables remain
185 | safe during a stack overflow condition. Nice!
186 |
187 | ## `cortex-m-rt-ld`
188 |
189 | Now all we need to do is change the memory layout of the program. The [`cortex-m-rt`] crate decides
190 | the memory layout by providing a [linker script] to the linker. This linker script describes the
191 | memory layout of the program in a declarative manner (details [here], if you are interested).
192 |
193 | [here]: https://sourceware.org/binutils/docs/ld/Scripts.html
194 |
195 | The problem is that linker scripts don't support arranging memory as we want: they only let you
196 | specify the *start* address of sections like `.bss+.data` but in this case we want to specify the
197 | *end* address of `.bss+.data`. We can't specify the start address of `.bss+.data` to be
198 | `0x2000_4000` or some other fixed number because the correct number depends on the size of the
199 | `.bss+.data` section and linker scripts don't provide support to get the size of an *output* section
200 | -- simply because the size is not known at link time; the size of a section will only be known
201 | *after* the linking process.
202 |
203 | [`cortex-m-rt`]: https://crates.io/crates/cortex-m-rt
204 | [linker script]: https://github.com/japaric/cortex-m-rt/blob/v0.3.13/link.x
205 |
206 | The workaround for this missing linker script functionality is ... to link the program *twice* --
207 | this technique is [also used in the C world][c]. Linking is done the first time to figure out the
208 | size of the `.bss+.data` section; after linking you can run `arm-none-eabi-size` over the output
209 | binary and find out the size. In the second linking step we feed the size of the section to the
210 | linker script, as a *hardcoded* number, and use that to select the right start address of the
211 | `.bss+.data` section.
212 |
213 | [c]: https://stackoverflow.com/a/39477543
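
Spelled out by hand, the two link passes look roughly like this (paths and the symbol name are
illustrative, not the exact ones `cortex-m-rt-ld` uses):

``` console
$ # 1st pass: link normally, just to learn the size of .bss + .data
$ arm-none-eabi-ld -Tlink.x -o first-pass.elf <objects>
$ arm-none-eabi-size -A first-pass.elf   # read off the .bss and .data sizes

$ # 2nd pass: feed that size back in as a hardcoded symbol that the linker
$ # script can use to compute the start address of .bss + .data
$ arm-none-eabi-ld -Tlink.x --defsym=_bss_data_size=0x1000 -o final.elf <objects>
```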
214 |
215 | In C this two step linking is done using Makefiles. We can't replicate that approach in Rust because
216 | it requires having the user explicitly write down the linker invocations and in Rust land linking is
217 | done transparently by `rustc` / Cargo.
218 |
219 | So what we'll do instead is to use a *linker wrapper*. Instead of linking the program using
220 | `arm-none-eabi-ld` we'll use a linker wrapper called [`cortex-m-rt-ld`]. This wrapper is a Rust
221 | program that will call the linker twice.
222 |
223 | The only thing a user needs to do, apart from installing `cortex-m-rt-ld`, is to change the linker
224 | in Cargo's configuration file:
225 |
226 | [`cortex-m-rt-ld`]: https://crates.io/crates/cortex-m-rt-ld
227 |
228 | ``` console
229 | $ # this file comes from the cortex-m-quickstart template v0.2.4
230 | $ cat .cargo/config
231 | [target.thumbv7m-none-eabi]
232 | runner = 'arm-none-eabi-gdb'
233 | rustflags = [
234 | "-C", "link-arg=-Tlink.x",
235 | "-C", "linker=cortex-m-rt-ld", # <- CHANGED!
236 | "-Z", "linker-flavor=ld",
237 | "-Z", "thinlto=no",
238 | ]
239 |
240 | [build]
241 | target = "thumbv7m-none-eabi"
242 | ```
243 |
244 | This will make `rustc` invoke `cortex-m-rt-ld` with all the arguments it would normally pass to
245 | `arm-none-eabi-ld`.
246 |
247 | ## In practice
248 |
249 | Let's put this technique into practice by relinking the Cortex-M program I showed before. But before
250 | we do that let's look at the linker sections of the binary we debugged.
251 |
252 | ``` console
253 | $ arm-none-eabi-size -Ax target/thumbv7m-none-eabi/debug/overflow
254 | section size addr
255 | .vector_table 0x130 0x8000000
256 | .text 0xeb2 0x8000130
257 | .rodata 0x294 0x8000ff0
258 | .stack 0x5000 0x20000000
259 | .bss 0x0 0x20000000
260 | .data 0x1000 0x20000000
261 | ```
262 |
263 | This output shows the start addresses and the sizes of the `.stack`, `.bss` and `.data` sections.
264 | From the output you can see that they overlap: `.stack` starts at address `0x2000_5000` and ends at
265 | address `0x2000_0000` (remember that it grows downwards); `.data` starts at address `0x2000_0000`
266 | and ends at address
267 | `0x2000_1000`.
268 |
269 | Now let's relink the program using `cortex-m-rt-ld` and look at the linker sections again.
270 |
271 | ``` console
272 | $ arm-none-eabi-size -Ax target/thumbv7m-none-eabi/debug/overflow
273 | section size addr
274 | .vector_table 0x130 0x8000000
275 | .text 0xeb2 0x8000130
276 | .rodata 0x294 0x8000ff0
277 | .stack 0x4000 0x20000000
278 | .bss 0x0 0x20004000
279 | .data 0x1000 0x20004000
280 | ```
281 |
282 | Now the sections don't overlap! `.stack` starts at address `0x2000_4000` and ends at address
283 | `0x2000_0000`; `.data` starts at address `0x2000_4000` and ends at address `0x2000_5000`.
284 |
285 | I mentioned that on stack overflow a hard fault exception would be triggered. Turns out we can
286 | define *how* that is handled using the `exception!` macro so we can choose how the program should
287 | behave on a stack overflow condition.
288 |
289 | ``` rust
290 | #![no_std]
291 |
292 | extern crate cortex_m;
293 | #[macro_use(exception)] // NEW!
294 | extern crate stm32f103xx;
295 |
296 | // same program as before
297 |
298 | // NEW!
299 | exception!(HARD_FAULT, on_stack_overflow);
300 |
301 | #[inline(always)]
302 | fn on_stack_overflow() {
303 | asm::bkpt();
304 | }
305 | ```
306 |
307 | Now let's run this program.
308 |
309 | ``` console
310 | > # GDB
311 | > continue
312 | overflow::main () at src/main.rs:15
313 | 15 asm::bkpt();
314 |
315 | > # breakpoint in `main`
316 |
317 | > continue
318 | HARD_FAULT () at :14
319 | 14 : No such file or directory.
320 |
321 | > # breakpoint in `on_stack_overflow`
322 |
323 | > x/1028x 0x20003ff0
324 | 0x20003ff0: 0x00000000 0x00000000 0x00000014 0xffffffff
325 | 0x20004000: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef # start of DATA
326 | (..)
327 | 0x20004ff0: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef # end of DATA
328 | ```
329 |
330 | This time we hit the `HARD_FAULT` exception handler during the stack overflow and the `DATA`
331 | variable remained intact.
332 |
333 | # What if I have a heap?
334 |
335 | When you have a heap and you use the standard memory layout you can run into two different problems:
336 | a stack overflow can overwrite the `.heap`; and memory allocations can make the `.heap` grow too
337 | large and crash into the `.stack`, overwriting it.
338 |
339 |
340 |
341 |
342 |
343 | Again, tweaking the memory layout can prevent the problem. If you place the `.heap` at the top of
344 | the RAM, place `.bss+.data` below it and the `.stack` below that then you avoid memory corruption in
345 | both scenarios.
346 |
347 |
348 |
349 |
350 |
351 | `cortex-m-rt-ld` supports this memory layout but it requires you to specify the size of the `.heap`
352 | in a linker script. You can do that by adding a `_heap_size` symbol to `memory.x`, if you are
353 | providing that file; or by passing a new linker script that provides that symbol to the linker.
354 |
355 | The former will look like this:
356 |
357 | ``` console
358 | $ tail -n1 memory.x
359 | _heap_size = 0x400; /* 1 KB */
360 | ```
361 |
362 | And the latter will look like this:
363 |
364 | ``` console
365 | $ echo '_heap_size = 0x400;' > heap.x
366 |
367 | $ cat .cargo/config
368 | [target.thumbv7m-none-eabi]
369 | runner = 'arm-none-eabi-gdb'
370 | rustflags = [
371 | "-C", "link-arg=-Tlink.x",
372 | "-C", "link-arg=-Theap.x", # NEW!
373 | "-C", "linker=cortex-m-rt-ld",
374 | "-Z", "linker-flavor=ld",
375 | "-Z", "thinlto=no",
376 | ]
377 |
378 | [build]
379 | target = "thumbv7m-none-eabi"
380 | ```
381 |
382 | Here are the linker sections of our running example after adding a 1 KB heap and linking it using
383 | `cortex-m-rt-ld`.
384 |
385 | ``` console
386 | $ arm-none-eabi-size -Ax target/thumbv7m-none-eabi/debug/overflow
387 | section size addr
388 | .vector_table 0x130 0x8000000
389 | .text 0xe8e 0x8000130
390 | .rodata 0x294 0x8000fc0
391 | .stack 0x3c00 0x20000000
392 | .bss 0x0 0x20003c00
393 | .data 0x1000 0x20003c00
394 | .heap 0x400 0x20004c00
395 | ```
396 |
397 | Note how `.bss`, `.data` and `.stack` have been pushed *down* (towards a lower address) by the
398 | `.heap`.
399 |
400 | # Other configurations?
401 |
402 | Currently `cortex-m-rt-ld` doesn't support memory layouts that involve more than one RAM region but
403 | we don't have great support for that in `cortex-m-rt` either so there's not much point in supporting
404 | that in `cortex-m-rt-ld` at the moment.
405 |
406 | The approach described here doesn't help if you are using threads, where each one has its own stack.
407 | In that scenario the thread stacks are laid out contiguously in memory and no amount of shuffling
408 | around will prevent one from overflowing into the other. In that case pretty much your only choice is to
409 | use an MPU (Memory Protection Unit) -- assuming your microcontroller has one -- to create stack
410 | boundaries on demand. Using the MPU is not zero cost as there's some setup involved on each context
411 | switch.
412 |
413 | # Conclusion
414 |
415 | That's it. Protect your ARM Cortex-M program from stack overflows and make it truly memory safe by
416 | just swapping out the linker!
417 |
418 | ---
419 |
420 | __Thank you patrons! :heart:__
421 |
422 | I want to wholeheartedly thank:
423 |
424 |