├── .github
│   └── workflows
│       ├── build.yml
│       └── deploy.yaml
├── LICENSE.md
├── package.json
├── spec
│   └── index.html
└── README.md

/.github/workflows/build.yml:
--------------------------------------------------------------------------------
name: Build spec

on: pull_request

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v3
        with:
          node-version: 16
      - run: npm install
      - run: npm run build

--------------------------------------------------------------------------------
/.github/workflows/deploy.yaml:
--------------------------------------------------------------------------------
name: Deploy gh-pages

on:
  push:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v3
        with:
          node-version: 16
      - run: npm install
      - run: npm run build
      - uses: JamesIves/github-pages-deploy-action@v4.3.3
        with:
          branch: gh-pages
          folder: build
          clean: true

--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
This repository is licensed according to Ecma International TC39's [Intellectual Property Policy](https://github.com/tc39/how-we-work/blob/HEAD/ip.md). In particular:
- Natural language text is licensed under the "Alternative copyright notice" of the [Ecma text copyright policy](https://www.ecma-international.org/memento/Ecma%20copyright%20faq.htm).
- Source code is licensed under Ecma's MIT-style [Ecma International Policy on Submission, Inclusion and Licensing of Software](https://www.ecma-international.org/memento/Policies/Ecma_Policy_on_Submission_Inclusion_and_Licensing_of_Software.htm).
- Contributions are only accepted from either representatives of Ecma members or signatories of TC39's [Contributor Form](https://tc39.github.io/agreements/contributor/).

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
{
  "private": true,
  "name": "proposal-atomics-microwait",
  "description": "Proposal for Atomics.microwait (build)",
  "license": "Fair",
  "scripts": {
    "build": "npm run build-loose -- --strict",
    "build-loose": "node -e 'fs.mkdirSync(\"build\", { recursive: true })' && ecmarkup --load-biblio @tc39/ecma262-biblio --verbose spec/index.html build/index.html --js-out build/ecmarkup.js --css-out build/ecmarkup.css --lint-spec",
    "format": "emu-format --write 'spec/*.html'"
  },
  "devDependencies": {
    "@tc39/ecma262-biblio": "^2.1.2407",
    "ecmarkup": "^15.0.0"
  },
  "version": "1.0.0",
  "repository": {
    "type": "git",
    "url": "git+https://github.com/tc39/proposal-atomics-microwait.git"
  },
  "bugs": {
    "url": "https://github.com/tc39/proposal-atomics-microwait/issues"
  },
  "homepage": "https://github.com/tc39/proposal-atomics-microwait#readme"
}

--------------------------------------------------------------------------------
/spec/index.html:
--------------------------------------------------------------------------------
<!DOCTYPE html>
<meta charset="utf8">
<pre class="metadata">
title: Atomics.pause
stage: 3
contributors: Shu-yu Guo
markEffects: true
</pre>

<emu-clause id="sec-atomics.pause">
  <h1>Atomics.pause ( [ _N_ ] )</h1>
  <p>This method performs the following steps when called:</p>
  <emu-alg>
    1. If _N_ is neither *undefined* nor an integral Number, throw a *TypeError* exception.
    1. If the execution environment of the ECMAScript implementation supports signaling to the operating system or CPU that the current executing code is in a spin-wait loop, such as executing a `pause` CPU instruction, send that signal. When _N_ is not *undefined*, it determines the number of times that signal is sent. The number of times the signal is sent for an integral Number _N_ is less than or equal to the number of times it is sent for _N_ + 1 if both _N_ and _N_ + 1 have the same sign.
    1. Return *undefined*.
  </emu-alg>
  <emu-note>
    <p>This method is designed for programs implementing spin-wait loops, such as spinlock fast paths inside of mutexes, to provide a hint to the CPU that it is spinning while waiting on a value. It has no observable behaviour other than timing.</p>
  </emu-note>
  <emu-note>
    <p>Implementations are expected to execute a pause or yield instruction if the best practices of the underlying architecture recommend such instructions in spin-wait loops. For example, the Intel Optimization Manual recommends the <code>pause</code> instruction.</p>
  </emu-note>
  <emu-note>
    <p>The _N_ parameter controls how long an implementation pauses: larger values result in longer waits. Implementations are encouraged to have an internal upper bound, on the order of tens to hundreds of nanoseconds, on the maximum amount of time paused.</p>
    <p>Callers that pass the _N_ parameter can use it to implement backoff strategies. For backoff strategies where subsequent waits become longer, linear backoff can be implemented by passing linearly increasing values for non-negative _N_; exponential backoff can be implemented by passing exponentially increasing values for non-negative _N_; and so on. For backoff strategies where subsequent waits become shorter, linear backoff can be implemented by passing linearly decreasing values for negative _N_; exponential backoff can be implemented by passing exponentially decreasing values for negative _N_; and so on.</p>
  </emu-note>
  <emu-note>
    <p>Due to the overhead of function calls, it is reasonable that an inlined call to this method in an optimizing compiler waits a different amount of time than a non-inlined call.</p>
  </emu-note>
</emu-clause>

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Micro waits in JS

Stage: 3

Author: Shu-yu Guo

Champion: Shu-yu Guo

## Motivation

Efficient implementations of locks use a loop body of the following shape to acquire the lock:

```javascript
// Fast path
let spins = 0;
do {
  if (TryLock()) {
    // Lock acquired.
    return;
  }

  SpinForALittleBit();
  spins++;
} while (spins < kSpinCount);

// Slow path
PutThreadToSleepUntilLockReleased();
```

This algorithm is both fast when contention is low and efficient when contention is high. When contention is low (i.e. the likelihood that the lock is released after a very short amount of time is high), spinning for a little bit improves performance because the code does not re-enter the kernel to yield or to sleep. When contention is high (i.e. the likelihood that the lock is released after a very short amount of time is low), going back to the kernel to put the executing thread to sleep improves efficiency.

`SpinForALittleBit()` is impossible to write optimally in JS, as the optimal version often requires hinting the CPU to release shared resources to sibling hardware threads. On x86, for example, the [Intel optimization manual](https://www.intel.com/content/www/us/en/content-details/671488/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html) recommends a loop with the `pause` instruction and exponential backoff. Versions without this hinting suffer performance and scheduling issues.

`PutThreadToSleepUntilLockReleased()` is impossible to write optimally on the main thread, as blocking is disallowed on the main thread as a policy choice to prevent deadlocks and hangs in the browser UI. **This proposal does not seek to solve this use case**.

### Use case: Emscripten

Emscripten uses a [busy loop](https://github.com/emscripten-core/emscripten/blob/bc5998833dcd0f48e90a8cb13fdf40e36480e4cb/system/lib/pthread/emscripten_futex_wait.c#L20-L112) to emulate a blocking wait in its implementation of futexes on the main thread, which are in turn used to implement mutexes.

Allowing microwaits will improve the power and scheduling efficiency of multithreaded applications compiled using Emscripten.

## Proposal

I propose one new method on `Atomics`.

### `Atomics.pause`

For better spinning, add `Atomics.pause(N)`. It waits for a finite, very short amount of time, in a way that runtimes can implement with the appropriate CPU hinting. It has no observable behavior other than timing.

Unlike `Atomics.wait`, it does not block, so it can be called from both the main thread and worker threads.

Implementations are expected to perform a short spin with CPU yielding, following best practices for the underlying architecture. The non-negative integer argument `N` controls the pause time, with larger values of `N` pausing for longer. It can be used to implement backoff algorithms when the microwait itself is in a loop.

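To make the intended usage concrete, here is a rough sketch of a spinlock fast path built on the proposed `Atomics.pause`. It is illustrative only: the lock encoding (`UNLOCKED`/`LOCKED`), the `SPIN_COUNT` bound, and the `Atomics.wait`-based slow path are assumptions made for this example, not part of the proposal.

```javascript
// Sketch: acquire a lock stored in an Int32Array backed by a SharedArrayBuffer.
const UNLOCKED = 0;
const LOCKED = 1;
const SPIN_COUNT = 16; // Illustrative bound, not prescribed by the proposal.

function lock(i32, index) {
  // Fast path: spin briefly, hinting the CPU via Atomics.pause.
  for (let spins = 0; spins < SPIN_COUNT; spins++) {
    if (Atomics.compareExchange(i32, index, UNLOCKED, LOCKED) === UNLOCKED) {
      return; // Lock acquired without entering the kernel.
    }
    // Exponentially increasing N implements exponential backoff.
    Atomics.pause(1 << spins);
  }
  // Slow path: block until the lock is released (worker threads only,
  // since Atomics.wait cannot be called on the main thread).
  while (Atomics.compareExchange(i32, index, UNLOCKED, LOCKED) !== UNLOCKED) {
    Atomics.wait(i32, index, LOCKED);
  }
}

function unlock(i32, index) {
  Atomics.store(i32, index, UNLOCKED);
  Atomics.notify(i32, index, 1); // Wake one waiter blocked in the slow path.
}
```

On the main thread, only the spinning fast path applies; what to do once the spin budget is exhausted there is exactly the unsolved problem described in the Motivation section.
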
## Prior discussions and acknowledgements

Microwaits were discussed previously when SharedArrayBuffers were proposed; see https://github.com/tc39/proposal-ecmascript-sharedmem/issues/87. In my opinion, the arguments on that thread still hold today. `Atomics.pause` is essentially the same as Lars's earlier design.

Thread yields and efficient spin loops have also been discussed in the context of WebAssembly. See https://github.com/WebAssembly/threads/issues/15.

## FAQ

### Does `Atomics.pause()` yield execution to another thread?

No. Microwaiting yields shared resources within a CPU core without giving up occupancy of the core itself. Thread yielding is done at the OS level, not the CPU level.

### Why can't I block the main thread?

Blocking the main thread has catastrophic effects on the responsiveness and performance of web pages.

### I still want to block the main thread. Can we clamp timeouts on `Atomics.wait`?

Initially, this proposal included an overload of `Atomics.wait` that allowed the timeout value to be clamped to an implementation-defined limit on the main thread. The idea was that if the blocking periods were short enough and somehow lined up with what implementations considered "idle periods", then the policy choice of disallowing the main thread from being blocked would not be violated.

This overload was removed from the proposal for the following reasons:

- Not enough bang for the buck. There is no consensus to allow indefinite blocking on the main thread, and bounded-time blocking is of limited value for the use case.
- It is difficult to specify _how_ to clamp the timeout in a web embedding. The previous idea was to tie it to the current "idle period", which is sensitive to scheduled tasks, the presence of event handlers responding to input, and so on. This is mechanically complicated. Moreover, there are likely to be _no_ free idle periods in an application, which suggests we may need both a floor and a ceiling for the clamping. This further raises the specification challenge for the small gain.
--------------------------------------------------------------------------------