├── .gitignore
├── images
    ├── robinhood.gif
    ├── scatter.bqn
    ├── bar.bqn
    ├── util.bqn
    ├── gif.bqn
    ├── line.bqn
    ├── range.svg
    ├── rand.svg
    ├── rand_glide.svg
    ├── wolf.svg
    ├── parts.svg
    └── bad.svg
├── glide
    ├── Cargo.toml
    ├── src
    │   └── lib.rs
    └── README.md
├── wolfbench.sh
├── LICENSE
├── wolfbench.diff
├── res
    ├── c_rh.txt
    ├── c_flux.txt
    ├── c_quad.txt
    ├── r_flux.txt
    ├── r_rh.txt
    ├── r_wolf.txt
    ├── s_quad.txt
    ├── r_ska_.txt
    ├── s_merge.txt
    ├── rp_rh.txt
    ├── sp_rh.txt
    ├── wolf.txt
    ├── crit.txt
    └── crit5.txt
├── bench.c
├── crit.c
├── rhsort.c
└── README.md

/.gitignore:
--------------------------------------------------------------------------------
1 | a.out
2 | wolfsort
3 | runwolfbench
4 | glide/Cargo.lock
5 | glide/target/
6 | 
--------------------------------------------------------------------------------
/images/robinhood.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mlochbaum/rhsort/HEAD/images/robinhood.gif
--------------------------------------------------------------------------------
/glide/Cargo.toml:
--------------------------------------------------------------------------------
 1 | [package]
 2 | name = "glide"
 3 | version = "0.1.2"
 4 | edition = "2021"
 5 | 
 6 | [dependencies]
 7 | glidesort = { version = "0.1.2", features = ["unstable"] }
 8 | 
 9 | [lib]
10 | crate-type = ["cdylib"]
11 | 
12 | [profile.release]
13 | lto = "thin"
14 | 
--------------------------------------------------------------------------------
/glide/src/lib.rs:
--------------------------------------------------------------------------------
 1 | #[no_mangle]
 2 | pub unsafe extern "C" fn glidesort(x: *mut i32, len: u64) {
 3 |     glidesort::sort(std::slice::from_raw_parts_mut(x, len as usize))
 4 | }
 5 | 
 6 | // Not used: didn't seem faster
 7 | #[no_mangle]
 8 | pub unsafe extern "C" fn glidesort_buf(x: *mut i32, len: u64, buf: *mut i32) {
 9 |     glidesort::sort_with_buffer(
10 |         std::slice::from_raw_parts_mut(x, len as usize),
11 |         std::slice::from_raw_parts_mut(std::mem::transmute(buf), (2*len) as usize)
12 |     )
13 | }
14 | 
--------------------------------------------------------------------------------
/wolfbench.sh:
--------------------------------------------------------------------------------
 1 | #! /bin/sh
 2 | 
 3 | set -e
 4 | 
 5 | if [ ! -d wolfsort ]
 6 | then
 7 |   git clone https://github.com/scandum/wolfsort.git
 8 |   cd wolfsort/src
 9 |   git apply ../../wolfbench.diff
10 |   ln -s ../../rhsort.c .
11 |   wget https://raw.githubusercontent.com/orlp/pdqsort/b1ef26a55cdb60d236a5cb199c4234c704f46726/pdqsort.h
12 |   wget https://raw.githubusercontent.com/skarupke/ska_sort/2c14d4bf7b667032ed28a481065f721385e33849/ska_sort.hpp
13 | else
14 |   cd wolfsort/src
15 | fi
16 | g++ -O3 -w -fpermissive -D SKIP_LONGS bench.c -o ../../runwolfbench
17 | echo "Created ./runwolfbench"
18 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | Copyright (c) 2022 Marshall Lochbaum
 2 | 
 3 | Permission to use, copy, modify, and/or distribute this software for any
 4 | purpose with or without fee is hereby granted.
 5 | 
 6 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
 7 | REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
 8 | AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
 9 | INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
10 | LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
11 | OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
12 | PERFORMANCE OF THIS SOFTWARE.
13 | 
--------------------------------------------------------------------------------
/wolfbench.diff:
--------------------------------------------------------------------------------
 1 | diff --git a/src/bench.c b/src/bench.c
 2 | index 399de1c..d645e1d 100644
 3 | --- a/src/bench.c
 4 | +++ b/src/bench.c
 5 | @@ -43,7 +43,8 @@
 6 | 
 7 |  #define cmp(a,b) (*(a) > *(b)) // uncomment for fast primitive comparisons
 8 | 
 9 | -char *sorts[] = { "*", "blitsort", "crumsort", "fluxsort", "gridsort", "quadsort", "wolfsort" };
10 | +//char *sorts[] = { "*", "blitsort", "crumsort", "fluxsort", "gridsort", "quadsort", "wolfsort" };
11 | +char *sorts[] = { "*", "quadsort", "pdqsort", "fluxsort", "wolfsort", "ska_sort", "rhsort" };
12 | 
13 |  #if __has_include("blitsort.h")
14 |  #include "blitsort.h" // curl "https://raw.githubusercontent.com/scandum/blitsort/master/src/blitsort.{c,h}" -o "blitsort.#1"
15 | 
--------------------------------------------------------------------------------
/glide/README.md:
--------------------------------------------------------------------------------
 1 | Shim for benchmarking glidesort, a fast stable comparison sort in Rust.
 2 | It has similar performance characteristics to fluxsort: it benchmarks
 3 | substantially slower (~20%) on my CPU and just slightly slower on newer
 4 | ones. This includes overhead for panic safety: the author estimates this
 5 | at 10-15% and it wouldn't apply to a C version. I've decided not to use
 6 | it in published benchmarks because of the extra complication in setting
 7 | up Rust and the fact that it's not directly comparable to C.
 8 | 
 9 | To benchmark, starting in repository root directory:
10 | 
11 | ```sh
12 | # Nightly (this line, +nightly below) isn't required but is currently faster
13 | rustup update nightly
14 | # cargo fetches the package
15 | cd glide && cargo +nightly build --release && cd ..
16 | 17 | gcc -O3 -D GLIDESORT -D NOTEST bench.c -Lglide/target/release -lglide 18 | LD_LIBRARY_PATH=glide/target/release ./a.out l > res/r_glide.txt 19 | 20 | # rand.svg plus glidesort 21 | images/line.bqn res/r_{flux,wolf,ska_,glide,rh}.txt > images/rand_glide.svg 22 | ``` 23 | 24 | This last line assumes the other res/r_\*.txt files exist. See 25 | instructions for rand.svg in the main README. 26 | -------------------------------------------------------------------------------- /res/c_rh.txt: -------------------------------------------------------------------------------- 1 | Sorting 10,000 small-range 4-byte integers: rhsort 2 | Testing range 34: best: 1.288 avg: 1.329 ns/v 3 | Testing range 45: best: 1.292 avg: 1.352 ns/v 4 | Testing range 59: best: 1.280 avg: 1.325 ns/v 5 | Testing range 77: best: 1.270 avg: 1.305 ns/v 6 | Testing range 100: best: 1.329 avg: 1.425 ns/v 7 | Testing range 130: best: 1.331 avg: 1.433 ns/v 8 | Testing range 169: best: 1.327 avg: 1.432 ns/v 9 | Testing range 219: best: 1.333 avg: 1.427 ns/v 10 | Testing range 284: best: 1.333 avg: 1.414 ns/v 11 | Testing range 369: best: 1.364 avg: 1.441 ns/v 12 | Testing range 479: best: 1.387 avg: 1.455 ns/v 13 | Testing range 622: best: 1.418 avg: 1.494 ns/v 14 | Testing range 808: best: 1.455 avg: 1.530 ns/v 15 | Testing range 1050: best: 1.512 avg: 1.574 ns/v 16 | Testing range 1365: best: 1.730 avg: 1.773 ns/v 17 | Testing range 1774: best: 1.767 avg: 1.806 ns/v 18 | Testing range 2306: best: 1.815 avg: 1.845 ns/v 19 | Testing range 2997: best: 1.875 avg: 1.909 ns/v 20 | Testing range 3896: best: 1.958 avg: 1.996 ns/v 21 | Testing range 5064: best: 2.115 avg: 2.158 ns/v 22 | Testing range 6583: best: 2.284 avg: 2.320 ns/v 23 | Testing range 8557: best: 2.520 avg: 2.560 ns/v 24 | Testing range 11124: best: 2.799 avg: 2.852 ns/v 25 | Testing range 14461: best: 3.191 avg: 3.264 ns/v 26 | Testing range 18799: best: 3.696 avg: 3.812 ns/v 27 | Testing range 24438: best: 4.518 avg: 4.640 ns/v 28 | 
Testing range 31769: best: 5.555 avg: 5.714 ns/v 29 | Testing range 41299: best: 5.322 avg: 5.478 ns/v 30 | Testing range 53688: best: 5.323 avg: 5.543 ns/v 31 | Testing range 69794: best: 5.248 avg: 5.410 ns/v 32 | Testing range 90732: best: 5.610 avg: 5.771 ns/v 33 | Testing range 117951: best: 5.115 avg: 5.274 ns/v 34 | Testing range 153336: best: 5.254 avg: 5.381 ns/v 35 | -------------------------------------------------------------------------------- /res/c_flux.txt: -------------------------------------------------------------------------------- 1 | Sorting 10,000 small-range 4-byte integers: fluxsort 2 | Testing range 34: best: 6.883 avg: 7.422 ns/v 3 | Testing range 45: best: 7.464 avg: 7.989 ns/v 4 | Testing range 59: best: 7.890 avg: 8.506 ns/v 5 | Testing range 77: best: 8.515 avg: 9.002 ns/v 6 | Testing range 100: best: 9.243 avg: 9.643 ns/v 7 | Testing range 130: best: 9.881 avg: 10.333 ns/v 8 | Testing range 169: best: 10.594 avg: 11.037 ns/v 9 | Testing range 219: best: 11.438 avg: 11.871 ns/v 10 | Testing range 284: best: 12.282 avg: 12.817 ns/v 11 | Testing range 369: best: 13.032 avg: 13.566 ns/v 12 | Testing range 479: best: 13.751 avg: 14.304 ns/v 13 | Testing range 622: best: 14.998 avg: 15.545 ns/v 14 | Testing range 808: best: 17.483 avg: 18.317 ns/v 15 | Testing range 1050: best: 19.535 avg: 20.315 ns/v 16 | Testing range 1365: best: 20.632 avg: 21.454 ns/v 17 | Testing range 1774: best: 21.010 avg: 21.785 ns/v 18 | Testing range 2306: best: 21.127 avg: 21.931 ns/v 19 | Testing range 2997: best: 21.094 avg: 21.928 ns/v 20 | Testing range 3896: best: 21.067 avg: 21.733 ns/v 21 | Testing range 5064: best: 21.069 avg: 21.920 ns/v 22 | Testing range 6583: best: 21.090 avg: 21.913 ns/v 23 | Testing range 8557: best: 21.054 avg: 21.793 ns/v 24 | Testing range 11124: best: 20.986 avg: 21.905 ns/v 25 | Testing range 14461: best: 20.952 avg: 21.838 ns/v 26 | Testing range 18799: best: 20.965 avg: 21.637 ns/v 27 | Testing range 24438: best: 20.975 
avg: 21.699 ns/v 28 | Testing range 31769: best: 21.019 avg: 21.715 ns/v 29 | Testing range 41299: best: 21.061 avg: 21.773 ns/v 30 | Testing range 53688: best: 21.007 avg: 21.861 ns/v 31 | Testing range 69794: best: 21.008 avg: 21.871 ns/v 32 | Testing range 90732: best: 20.947 avg: 21.817 ns/v 33 | Testing range 117951: best: 21.004 avg: 21.858 ns/v 34 | Testing range 153336: best: 20.935 avg: 21.805 ns/v 35 | -------------------------------------------------------------------------------- /res/c_quad.txt: -------------------------------------------------------------------------------- 1 | Sorting 10,000 small-range 4-byte integers: quadsort 2 | Testing range 34: best: 18.633 avg: 19.288 ns/v 3 | Testing range 45: best: 19.267 avg: 19.841 ns/v 4 | Testing range 59: best: 19.764 avg: 20.436 ns/v 5 | Testing range 77: best: 20.369 avg: 21.001 ns/v 6 | Testing range 100: best: 20.757 avg: 21.358 ns/v 7 | Testing range 130: best: 21.225 avg: 21.827 ns/v 8 | Testing range 169: best: 21.764 avg: 22.182 ns/v 9 | Testing range 219: best: 22.065 avg: 22.729 ns/v 10 | Testing range 284: best: 22.752 avg: 23.392 ns/v 11 | Testing range 369: best: 22.892 avg: 23.588 ns/v 12 | Testing range 479: best: 23.137 avg: 23.616 ns/v 13 | Testing range 622: best: 23.255 avg: 23.681 ns/v 14 | Testing range 808: best: 23.338 avg: 23.757 ns/v 15 | Testing range 1050: best: 23.432 avg: 24.004 ns/v 16 | Testing range 1365: best: 23.579 avg: 24.216 ns/v 17 | Testing range 1774: best: 23.664 avg: 24.342 ns/v 18 | Testing range 2306: best: 23.796 avg: 24.467 ns/v 19 | Testing range 2997: best: 23.828 avg: 24.507 ns/v 20 | Testing range 3896: best: 23.911 avg: 24.482 ns/v 21 | Testing range 5064: best: 23.945 avg: 24.383 ns/v 22 | Testing range 6583: best: 23.908 avg: 24.567 ns/v 23 | Testing range 8557: best: 23.900 avg: 24.420 ns/v 24 | Testing range 11124: best: 23.973 avg: 24.546 ns/v 25 | Testing range 14461: best: 23.995 avg: 24.574 ns/v 26 | Testing range 18799: best: 23.921 avg: 24.375 
ns/v 27 | Testing range 24438: best: 23.961 avg: 24.403 ns/v 28 | Testing range 31769: best: 23.941 avg: 24.498 ns/v 29 | Testing range 41299: best: 23.962 avg: 24.649 ns/v 30 | Testing range 53688: best: 23.935 avg: 24.541 ns/v 31 | Testing range 69794: best: 23.930 avg: 24.419 ns/v 32 | Testing range 90732: best: 23.932 avg: 24.485 ns/v 33 | Testing range 117951: best: 23.986 avg: 24.633 ns/v 34 | Testing range 153336: best: 23.989 avg: 24.659 ns/v 35 | -------------------------------------------------------------------------------- /images/scatter.bqn: -------------------------------------------------------------------------------- 1 | #! /usr/bin/env bqn 2 | 3 | util ← •Import "util.bqn" 4 | Num‿Pos‿SVG‿Ge‿Text‿TSize‿Path‿Background‿Outline‿Legend‿Tick ← util 5 | Enc‿Elt‿At‿Rect ← util 6 | 7 | "Input filename required" ! 1≤≠•args 8 | input ← •wdpath •file.At ⊑•args 9 | 10 | ToNats ← ((>⟜«0⊸≤) / 0(0⊸≤××⟜10⊸+)`⊢) · 10⊸≤⊸(¬⊸×-⊣) -⟜'0' 11 | Parse ← { 12 | l ← •file.Lines 𝕩 13 | t ← ∾⟜" of "⊸∾´ Num ToNats ⊑l 14 | t ⋈ ¯1 (↓ ⋈ ⊏÷1e3˙) (¯1∾˜1+↕11) ⊏ ⍉ ∘‿14 ⥊ ToNats '.'⊸≠⊸/ ∾1↓l 15 | } 16 | ⟨title,samples‿time⟩ ← Parse input 17 | 18 | ylog ← 0 19 | width ← 512 20 | off ← 50‿40 ⋄ end ← 20‿46 21 | 22 | gr ← "stroke-width=1.2|font-size=14px|text-anchor=middle" 23 | 24 | Marker ← "circle" Elt "r"‿"2"∾"cx"‿"cy"≍˘Num 25 | Clip ← <"defs" Enc ("clipPath"At"id=clip") Enc Rect 26 | Dist ← { 27 | h ← 0(∾∾∾⟜(-⌽))2(24÷4+↓¬-⊸↓)𝕨 28 | p ← (∾○(¯1⊸↓)⟜⌽𝕨) ≍˘ 𝕩+h 29 | "Z" ∾˜ 'M'⌾⊑ ∾⥊ "L " ∾¨⎉1 Num p 30 | } 31 | 32 | Log ← ⋆⁼⌾(ylog⊑⊑‿⊢) 33 | win ← -˜`¨ bounds ← Log 30‿800 ⋈ 0⋈⌈´time 34 | dim ← width (⊣⋈×) 0.55 35 | out ← (-off)≍dim+off+end 36 | padb ← 1‿ylog × padi ← 10‿20 37 | Scale ← padi+(dim-padb+padi)× ({¬𝕏}⌾(1⊸⊑) {𝕩÷˜𝕨-˜⊢}´¨ win) {𝕎𝕩}¨ Log 38 | dat ← Scale ⟨<˘⍉5⌈samples, time⟩ 39 | 40 | •Out¨ (⥊out) SVG gr Ge ⟨ 41 | Clip 0‿0≍dim 42 | Background out 43 | <((TSize 18)∾Pos⟨width÷2,¯19⟩) Text "RH criterion, sampling "∾title 44 | ⟨ 45 | "Score" Text˜ Pos 0‿26+dim×0.5‿1 46 | "Time (ns / 
value)" Text˜ "transform"‿"rotate(-90)"∾Pos 0‿32-˜⌽dim×0‿¯0.5 47 | ⟩ 48 | Tick { 49 | off⇐0‿0 ⋄ dim⇐dim 50 | orient⇐"vh" 51 | RoundDown ← {(⊑∘⍋⟜𝕩-1˙)⊸⊑1‿2‿5×⌊⌾(10⋆⁼⊢)𝕩} 52 | tpos ⇐ Scale ticks ← ⟨ 53 | ⥊(10⋆↕4)×⌜∧1.5‿7∾1+↕5 54 | {ylog ? ⥊1‿10×⌜∧1.5‿7∾1+↕5 ; (RoundDown 6÷˜1⊑1⊑bounds)×1+↕15} 55 | ⟩ 56 | ttext ⇐ ⟨1e3⊸<◶Num‿("1e"∾'0'+·⌊0.5+10⋆⁼⊢)¨, Num⟩ {𝕎𝕩}¨ ticks 57 | } 58 | "clip-path=url(#clip)" Ge ⟨ 59 | "fill=#24851d" Ge (Marker 5⊸⊑⊸⋈)¨´ dat 60 | "fill=#176e10" Ge (≍"opacity"‿"0.6")⊸Path¨ Dist¨´ dat 61 | ⟩ 62 | Outline 0‿0≍dim 63 | ⟩ 64 | -------------------------------------------------------------------------------- /images/bar.bqn: -------------------------------------------------------------------------------- 1 | #! /usr/bin/env bqn 2 | 3 | util ← •Import "util.bqn" 4 | Num‿Pos‿SVG‿Ge‿Rect‿Text‿TSize‿Background‿Outline‿Legend‿Tick ← util 5 | 6 | "Input filename required" ! 1≤≠•args 7 | input ← •wdpath •file.At ⊑•args 8 | 9 | # Parse the file 10 | Parse ← { 11 | h‿d ← (⊑⋈2⊸↓) (∨`⌾⌽'|'⊸=)⊸/¨ ('|'=(⊑1⊸↑)¨)⊸/ •file.Lines 𝕩 12 | c ← +`'|'=h 13 | m ← ¬∧˝"| "∊˜>d 14 | g ← 1-˜m×c ⊏ +˝(1+↕∘≠)⊸× "Name"‿"Distribution"‿"Average" ∨´∘⍷⌜ c ⊔ h 15 | <˘ ⍉ (0<·≠¨⊏˘)⊸/ (∨`' '⊸≠)⊸/¨ > g⊸⊔¨d 16 | } 17 | name‿dist‿avg ← Parse input 18 | ! (≠∘⊑⥊¨·⊣⌜˜`⌾⌽⍷¨)⊸≡ ⊐¨dist‿name 19 | PD ← { 20 | t←-≠o←" order" ⋄ 𝕩↓˜↩t×o≡t↑𝕩 21 | Abbr ← { 22 | m←⊢´⊸<(0¨e)(≠`»≠⊢)b←𝕩(≠∘⊢↑⍷)˜e←𝕨 23 | (b≥m)/𝕩+(b∧m)×'.'-𝕩 24 | } 25 | (-´"Aa")⊸+⌾⊑ "ending" Abbr "om" Abbr⍟(⊑'%'∊⊢) 𝕩 26 | } 27 | names‿colors‿isRH ← util.ProcName ¯4 ↓¨ ⍷name 28 | dists ← PD¨⍷dist 29 | 30 | # Create the plot 31 | noMax ← ⥊ (dists∊"Asc. saw"‿"Desc. 
saw") ∧⌜ names∊⋈"pdqsort" # too slow 32 | max ← ⌈´ (¬noMax) / avg •BQN¨↩ 33 | 34 | nn‿nd ← ≠¨names‿dists 35 | bt←bs←7 ⋄ gs←12+nn×bs 36 | w ← 14 + (w0←100) + tw ← 6 + pw←400-6 37 | h ← 68 + h1 ← (h0← 56) + ph ← nd×gs 38 | wh ← w‿h 39 | 40 | _cBar ← { "stroke=#111" Ge colors "fill="⊸∾⊸𝔽¨ 𝕩 } 41 | 42 | TT ← < TSize⊸∾⟜(Pos(w0+pw÷2)⊸⋈)´⊸Text 43 | 44 | gh ← h0+gs×0.5+↕nd 45 | bh ← gh +⌜ bs × (↕-2÷˜-⟜1)nn 46 | bars ← nd‿nn⥊pw×avg÷max 47 | 48 | •Out¨ (0‿0∾wh) SVG "text-anchor=middle|font-size=14px|fill=#111" Ge ⟨ 49 | Background 0‿0≍wh 50 | ⟨20,h0-37⟩ TT "The Big Bad Wolfsort benchmark" 51 | ⟨15,h0-14⟩ TT "Sort 100,000 4-byte integers, average of 100" 52 | ⟨15,h1+31⟩ TT "Nanoseconds per element" 53 | ⟨11,h1+51⟩ TT "i5-6200U CPU @ 2.30GHz; compiled with g++ -O3" 54 | Tick { 55 | off⇐w0‿h0 ⋄ dim⇐tw‿ph 56 | orient⇐"v" 57 | ticks ← (↕1+⌊)⌾(÷⟜10) maxns ← 1e4×max 58 | tpos ⇐ ⟨w0+pw×ticks÷maxns⟩ 59 | ttext ⇐ ⟨Num ticks⟩ 60 | } 61 | "text-anchor=end" Ge ((w0-5)Pos∘⋈¨gh) Text¨ dists 62 | Ge _cBar <˘⍉ bh Rect∘{⟨w0,𝕨-bt÷2⟩≍𝕩‿bt}¨ bars 63 | Outline w0‿h0≍tw‿ph 64 | Legend { 65 | isRH ⇐ isRH ⋄ label ⇐ names 66 | width ⇐ 150 67 | place ⇐ ⟨w0+tw-10+width,h0+85⟩ 68 | pad ⇐ 2 ÷˜ spacing ⇐ 11+bs 69 | Samples ⇐ Rect⟜{10‿(𝕩-bt÷2)≍18‿bt} _cBar 70 | } 71 | ⟩ 72 | -------------------------------------------------------------------------------- /images/util.bqn: -------------------------------------------------------------------------------- 1 | # SVG utilities 2 | Enc ⇐ { 3 | DeNest ← {(3⌊≡)◶⟨!∘0,⋈,⊢,∾𝕊¨⟩ ⥊𝕩} 4 | open ← ∾⟨"<",𝕨,">"⟩ 5 | close← ∾⟨"</",(∧`𝕨≠' ')/𝕨,">"⟩ 6 | l ← 1 < d←≡𝕩 7 | ∾ open ({" "⊸∾¨(∾DeNest¨)⍟(3≤d)⥊𝕩}⍟l 𝕩){𝕨‿𝕗‿𝕩}○(⥊∘<⍟l) close 8 | } 9 | At1 ← " " ∾ {∾⟨𝕨,"='",𝕩,"'"⟩}´ 10 | Attr ⇐ ∾⟜(∾ <∘At1⎉1) 11 | At ⇐ { 12 | _s ← {((+`׬)⊸-𝕗⊸=)⊸⊔} 13 | 𝕨 >⊘(∾⟜(∾At1¨)) '='_s¨ '|'_s 𝕩 14 | } 15 | Elt ⇐ {∾⟨"<",𝕩,"/>"⟩}Attr 16 | Num ⇐ ('¯' (⊢+=×'-'⊸-) (∧`4⥊⟜1⊸»'.'⊸≠)⊸/∘•Repr)⚇0 17 | Pos ⇐ ⟨"x","y"⟩ ≍˘ Num 18 | SVG ⇐ { 19 | a ← ⟨"viewBox",1↓∾' '∾¨Num 𝕨⟩∾"height"‿"width"≍˘Num 1.5×⌽2↓𝕨 20 | a ∾↩ 
"xmlns"‿"http://www.w3.org/2000/svg" 21 | ("svg" Attr a) Enc 𝕩 22 | } 23 | 24 | # Specific elements 25 | Ge ⇐ "g"⊸At⊸Enc 26 | Rdim ⇐ Pos∘⊏∾"width"‿"height"≍˘·Num⊢˝ 27 | Rect ⇐ "rect" Elt Rdim⊘(At⊸∾⟜Rdim) 28 | Path ⇐ "path" Elt "d"⊸⋈⊘(⊣∾"d"⋈⊢) 29 | Text ⇐ ("text"Attr"dy"‿"0.33em"⊸∾)⊸Enc 30 | TSize ⇐ "font-size"⋈Num∾"px"˙ 31 | 32 | # Graph components 33 | Background ⇐ <"fill=white" Rect ⊢ 34 | Outline ⇐ <"stroke=#111|fill=none" Rect ⊢ 35 | 36 | Legend ⇐ { 37 | place‿spacing‿pad‿width‿label‿isRH‿Samples ← 𝕩 38 | y ← pad+spacing×0.5+↕n←≠label 39 | dim ← ⟨width,(2×pad)+spacing×n⟩ 40 | rhs ← At"font-weight=bold|stroke-width=0.6|stroke=#211|fill=#24851d" 41 | ltr ← "transform=translate("∾")"∾˜∾⟜","⊸∾´Num place 42 | legend ← (ltr∾"|text-anchor=start|font-size=13px") Ge ⟨ 43 | "fill=white|stroke=#111" Rect 0‿0≍dim 44 | Samples y 45 | (rhs⊸∾¨⌾(isRH⊸/)36 Pos∘⋈¨y) Text¨ label 46 | ⟩ 47 | } 48 | 49 | Tick ⇐ { 50 | off‿dim‿orient‿tpos‿ttext ← 𝕩 51 | s ← 'h'=orient 52 | Lines ← { 53 | h ← 'h'=𝕨 54 | P ← ⌽⌾(2⊸↑)⍟h ∾⟜((¬h)⊏⍉off≍dim) 55 | ("path"Elt"d"⋈·∾("M "∾𝕨)∾¨Num∘P)¨ 𝕩 56 | } 57 | dist ← (s⊏dim)(-⌊⊢)tpos-s⊏off 58 | Filter ← ≤⟜dist/¨⊢ 59 | to ← 9.5‿¯6 + ⌽off+dim×0‿1 60 | Vals ← { "text-anchor=end" Ge⍟𝕨 (Pos 𝕨⌽⋈⟜(𝕨⊑to))⊸Text¨´ 𝕩 } 61 | ⟨ 62 | "stroke-width=0.3|stroke=#555" Ge orient Lines¨ 1 Filter tpos 63 | "font-size=11px" Ge s Vals¨ tpos ⋈¨○(0⊸Filter) ttext 64 | ⟩ 65 | } 66 | 67 | # Name info 68 | ProcName ⇐ { 69 | nam ← "rh" ‿"quad"‿"pdq" ‿"flux"‿"wolf"‿"ska_"‿"merge" 70 | col ← "#24851d"‿"#6ad"‿"#c7d"‿"#b9e"‿"#cc6"‿"#da4"‿"#69a" 71 | ce ← "#22c"‿"#c22"‿"#2a7"‿"#919" 72 | isRH ← 0 = ni ← nam ⊐ 𝕩 73 | ni +↩ (≠ce)|⊒⊸×ni=≠nam # Cycle through extra colors 74 | names ← isRH ⊣◶⟨⊢∾"sort"∾"_copy"/˜"ska_"⊸≡,"Robin Hood"⟩¨ 𝕩 75 | ⟨names, ni⊏col∾ce, isRH⟩ 76 | } 77 | -------------------------------------------------------------------------------- /res/r_flux.txt: -------------------------------------------------------------------------------- 1 | Sorting random 4-byte integers: fluxsort 
2 | Testing size 34: best: 7.618 avg: 9.806 ns/v 3 | Testing size 45: best: 8.667 avg: 12.068 ns/v 4 | Testing size 59: best: 10.000 avg: 13.840 ns/v 5 | Testing size 77: best: 10.403 avg: 14.358 ns/v 6 | Testing size 100: best: 10.270 avg: 13.779 ns/v 7 | Testing size 130: best: 10.600 avg: 12.385 ns/v 8 | Testing size 169: best: 13.266 avg: 15.715 ns/v 9 | Testing size 219: best: 13.393 avg: 16.132 ns/v 10 | Testing size 284: best: 13.972 avg: 16.627 ns/v 11 | Testing size 369: best: 14.157 avg: 17.044 ns/v 12 | Testing size 479: best: 14.797 avg: 17.193 ns/v 13 | Testing size 622: best: 15.018 avg: 17.422 ns/v 14 | Testing size 808: best: 15.488 avg: 17.771 ns/v 15 | Testing size 1050: best: 16.094 avg: 18.206 ns/v 16 | Testing size 1365: best: 17.251 avg: 18.732 ns/v 17 | Testing size 1774: best: 17.368 avg: 19.084 ns/v 18 | Testing size 2306: best: 18.368 avg: 19.386 ns/v 19 | Testing size 2997: best: 18.709 avg: 19.581 ns/v 20 | Testing size 3896: best: 19.001 avg: 20.120 ns/v 21 | Testing size 5064: best: 19.777 avg: 20.628 ns/v 22 | Testing size 6583: best: 20.234 avg: 21.099 ns/v 23 | Testing size 8557: best: 20.707 avg: 21.437 ns/v 24 | Testing size 11124: best: 21.134 avg: 21.919 ns/v 25 | Testing size 14461: best: 21.638 avg: 22.350 ns/v 26 | Testing size 18799: best: 22.109 avg: 22.793 ns/v 27 | Testing size 24438: best: 22.635 avg: 23.270 ns/v 28 | Testing size 31769: best: 23.205 avg: 23.631 ns/v 29 | Testing size 41299: best: 23.581 avg: 24.119 ns/v 30 | Testing size 53688: best: 24.109 avg: 24.566 ns/v 31 | Testing size 69794: best: 24.588 avg: 25.058 ns/v 32 | Testing size 90732: best: 25.074 avg: 25.577 ns/v 33 | Testing size 117951: best: 25.601 avg: 26.065 ns/v 34 | Testing size 153336: best: 26.667 avg: 27.249 ns/v 35 | Testing size 199336: best: 27.392 avg: 27.842 ns/v 36 | Testing size 259136: best: 27.488 avg: 28.321 ns/v 37 | Testing size 336876: best: 28.252 avg: 28.757 ns/v 38 | Testing size 437938: best: 29.180 avg: 29.674 ns/v 39 | 
Testing size 569319: best: 29.771 avg: 30.171 ns/v 40 | Testing size 740114: best: 29.953 avg: 30.464 ns/v 41 | Testing size 962148: best: 30.785 avg: 31.070 ns/v 42 | Testing size 1250792: best: 31.275 avg: 31.535 ns/v 43 | Testing size 1626029: best: 32.116 avg: 32.482 ns/v 44 | Testing size 2113837: best: 32.613 avg: 32.966 ns/v 45 | Testing size 2747988: best: 33.104 avg: 33.408 ns/v 46 | Testing size 3572384: best: 33.909 avg: 33.909 ns/v 47 | Testing size 4644099: best: 34.214 avg: 34.214 ns/v 48 | Testing size 6037328: best: 34.973 avg: 34.973 ns/v 49 | Testing size 7848526: best: 35.687 avg: 35.687 ns/v 50 | Testing size 10000000: best: 35.739 avg: 35.739 ns/v 51 | -------------------------------------------------------------------------------- /res/r_rh.txt: -------------------------------------------------------------------------------- 1 | Sorting random 4-byte integers: rhsort 2 | Testing size 34: best: 5.735 avg: 8.851 ns/v 3 | Testing size 45: best: 5.378 avg: 8.040 ns/v 4 | Testing size 59: best: 5.695 avg: 7.752 ns/v 5 | Testing size 77: best: 5.299 avg: 6.936 ns/v 6 | Testing size 100: best: 4.700 avg: 6.632 ns/v 7 | Testing size 130: best: 5.092 avg: 6.080 ns/v 8 | Testing size 169: best: 4.462 avg: 5.412 ns/v 9 | Testing size 219: best: 5.279 avg: 5.854 ns/v 10 | Testing size 284: best: 4.496 avg: 5.081 ns/v 11 | Testing size 369: best: 3.957 avg: 4.586 ns/v 12 | Testing size 479: best: 4.766 avg: 5.107 ns/v 13 | Testing size 622: best: 4.135 avg: 4.499 ns/v 14 | Testing size 808: best: 3.774 avg: 4.137 ns/v 15 | Testing size 1050: best: 4.432 avg: 4.659 ns/v 16 | Testing size 1365: best: 3.962 avg: 4.204 ns/v 17 | Testing size 1774: best: 4.995 avg: 5.157 ns/v 18 | Testing size 2306: best: 4.376 avg: 4.576 ns/v 19 | Testing size 2997: best: 4.170 avg: 4.552 ns/v 20 | Testing size 3896: best: 4.953 avg: 5.110 ns/v 21 | Testing size 5064: best: 4.976 avg: 5.135 ns/v 22 | Testing size 6583: best: 5.747 avg: 5.920 ns/v 23 | Testing size 8557: best: 
5.688 avg: 5.792 ns/v 24 | Testing size 11124: best: 6.283 avg: 6.401 ns/v 25 | Testing size 14461: best: 6.454 avg: 6.580 ns/v 26 | Testing size 18799: best: 6.694 avg: 6.800 ns/v 27 | Testing size 24438: best: 7.427 avg: 7.510 ns/v 28 | Testing size 31769: best: 7.331 avg: 7.530 ns/v 29 | Testing size 41299: best: 7.574 avg: 7.660 ns/v 30 | Testing size 53688: best: 7.751 avg: 8.051 ns/v 31 | Testing size 69794: best: 7.616 avg: 7.671 ns/v 32 | Testing size 90732: best: 8.010 avg: 8.060 ns/v 33 | Testing size 117951: best: 8.922 avg: 9.877 ns/v 34 | Testing size 153336: best: 9.107 avg: 9.395 ns/v 35 | Testing size 199336: best: 9.887 avg: 10.111 ns/v 36 | Testing size 259136: best: 14.072 avg: 15.917 ns/v 37 | Testing size 336876: best: 15.986 avg: 16.330 ns/v 38 | Testing size 437938: best: 23.473 avg: 25.419 ns/v 39 | Testing size 569319: best: 23.315 avg: 23.481 ns/v 40 | Testing size 740114: best: 24.058 avg: 24.203 ns/v 41 | Testing size 962148: best: 28.508 avg: 31.504 ns/v 42 | Testing size 1250792: best: 28.236 avg: 28.628 ns/v 43 | Testing size 1626029: best: 30.214 avg: 30.228 ns/v 44 | Testing size 2113837: best: 38.016 avg: 38.708 ns/v 45 | Testing size 2747988: best: 36.625 avg: 37.353 ns/v 46 | Testing size 3572384: best: 42.151 avg: 42.151 ns/v 47 | Testing size 4644099: best: 40.433 avg: 40.433 ns/v 48 | Testing size 6037328: best: 40.295 avg: 40.295 ns/v 49 | Testing size 7848526: best: 47.364 avg: 47.364 ns/v 50 | Testing size 10000000: best: 46.165 avg: 46.165 ns/v 51 | -------------------------------------------------------------------------------- /res/r_wolf.txt: -------------------------------------------------------------------------------- 1 | Sorting random 4-byte integers: wolfsort 2 | Testing size 34: best: 7.765 avg: 11.140 ns/v 3 | Testing size 45: best: 8.867 avg: 12.024 ns/v 4 | Testing size 59: best: 9.407 avg: 12.650 ns/v 5 | Testing size 77: best: 10.455 avg: 13.396 ns/v 6 | Testing size 100: best: 11.260 avg: 13.892 ns/v 7 | 
Testing size 130: best: 11.515 avg: 14.379 ns/v 8 | Testing size 169: best: 9.964 avg: 11.546 ns/v 9 | Testing size 219: best: 9.744 avg: 11.205 ns/v 10 | Testing size 284: best: 9.669 avg: 10.722 ns/v 11 | Testing size 369: best: 9.688 avg: 10.610 ns/v 12 | Testing size 479: best: 9.436 avg: 10.088 ns/v 13 | Testing size 622: best: 9.381 avg: 10.005 ns/v 14 | Testing size 808: best: 9.142 avg: 9.635 ns/v 15 | Testing size 1050: best: 8.988 avg: 9.523 ns/v 16 | Testing size 1365: best: 8.881 avg: 9.317 ns/v 17 | Testing size 1774: best: 9.112 avg: 9.429 ns/v 18 | Testing size 2306: best: 9.006 avg: 9.394 ns/v 19 | Testing size 2997: best: 9.185 avg: 9.520 ns/v 20 | Testing size 3896: best: 9.645 avg: 10.015 ns/v 21 | Testing size 5064: best: 10.312 avg: 10.619 ns/v 22 | Testing size 6583: best: 11.542 avg: 11.861 ns/v 23 | Testing size 8557: best: 16.287 avg: 16.767 ns/v 24 | Testing size 11124: best: 13.624 avg: 13.988 ns/v 25 | Testing size 14461: best: 14.644 avg: 14.931 ns/v 26 | Testing size 18799: best: 16.246 avg: 16.642 ns/v 27 | Testing size 24438: best: 17.465 avg: 17.734 ns/v 28 | Testing size 31769: best: 18.318 avg: 18.734 ns/v 29 | Testing size 41299: best: 19.471 avg: 19.949 ns/v 30 | Testing size 53688: best: 19.967 avg: 20.483 ns/v 31 | Testing size 69794: best: 20.717 avg: 21.419 ns/v 32 | Testing size 90732: best: 21.165 avg: 21.930 ns/v 33 | Testing size 117951: best: 22.006 avg: 22.973 ns/v 34 | Testing size 153336: best: 22.494 avg: 23.579 ns/v 35 | Testing size 199336: best: 23.237 avg: 24.157 ns/v 36 | Testing size 259136: best: 24.338 avg: 25.458 ns/v 37 | Testing size 336876: best: 25.389 avg: 26.586 ns/v 38 | Testing size 437938: best: 26.335 avg: 27.936 ns/v 39 | Testing size 569319: best: 27.179 avg: 28.246 ns/v 40 | Testing size 740114: best: 26.834 avg: 27.987 ns/v 41 | Testing size 962148: best: 27.431 avg: 28.640 ns/v 42 | Testing size 1250792: best: 29.468 avg: 30.532 ns/v 43 | Testing size 1626029: best: 34.102 avg: 34.917 ns/v 44 
| Testing size 2113837: best: 35.843 avg: 36.498 ns/v 45 | Testing size 2747988: best: 35.044 avg: 35.066 ns/v 46 | Testing size 3572384: best: 36.831 avg: 36.831 ns/v 47 | Testing size 4644099: best: 36.983 avg: 36.983 ns/v 48 | Testing size 6037328: best: 37.910 avg: 37.910 ns/v 49 | Testing size 7848526: best: 37.946 avg: 37.946 ns/v 50 | Testing size 10000000: best: 38.272 avg: 38.272 ns/v 51 | -------------------------------------------------------------------------------- /res/s_quad.txt: -------------------------------------------------------------------------------- 1 | Sorting random 4-byte integers: quadsort 2 | Testing size 34: best: 7.529 avg: 9.966 ns/v 3 | Testing size 45: best: 8.556 avg: 11.957 ns/v 4 | Testing size 59: best: 9.915 avg: 13.801 ns/v 5 | Testing size 77: best: 10.636 avg: 14.144 ns/v 6 | Testing size 100: best: 10.550 avg: 13.704 ns/v 7 | Testing size 130: best: 10.600 avg: 12.366 ns/v 8 | Testing size 169: best: 11.101 avg: 13.121 ns/v 9 | Testing size 219: best: 12.890 avg: 14.872 ns/v 10 | Testing size 284: best: 13.613 avg: 15.402 ns/v 11 | Testing size 369: best: 14.371 avg: 16.044 ns/v 12 | Testing size 479: best: 14.839 avg: 16.669 ns/v 13 | Testing size 622: best: 15.206 avg: 16.222 ns/v 14 | Testing size 808: best: 16.016 avg: 17.109 ns/v 15 | Testing size 1050: best: 16.886 avg: 17.880 ns/v 16 | Testing size 1365: best: 17.881 avg: 18.847 ns/v 17 | Testing size 1774: best: 18.271 avg: 19.294 ns/v 18 | Testing size 2306: best: 18.720 avg: 19.471 ns/v 19 | Testing size 2997: best: 19.658 avg: 20.431 ns/v 20 | Testing size 3896: best: 20.838 avg: 21.750 ns/v 21 | Testing size 5064: best: 22.144 avg: 22.925 ns/v 22 | Testing size 6583: best: 23.335 avg: 24.068 ns/v 23 | Testing size 8557: best: 22.769 avg: 23.525 ns/v 24 | Testing size 11124: best: 24.638 avg: 25.329 ns/v 25 | Testing size 14461: best: 26.001 avg: 26.511 ns/v 26 | Testing size 18799: best: 26.702 avg: 27.220 ns/v 27 | Testing size 24438: best: 28.582 avg: 29.182 
ns/v 28 | Testing size 31769: best: 29.192 avg: 29.654 ns/v 29 | Testing size 41299: best: 28.057 avg: 28.632 ns/v 30 | Testing size 53688: best: 29.843 avg: 30.288 ns/v 31 | Testing size 69794: best: 30.240 avg: 30.852 ns/v 32 | Testing size 90732: best: 32.071 avg: 32.604 ns/v 33 | Testing size 117951: best: 32.914 avg: 33.406 ns/v 34 | Testing size 153336: best: 32.167 avg: 32.973 ns/v 35 | Testing size 199336: best: 33.711 avg: 34.159 ns/v 36 | Testing size 259136: best: 35.261 avg: 35.734 ns/v 37 | Testing size 336876: best: 36.516 avg: 36.757 ns/v 38 | Testing size 437938: best: 37.420 avg: 37.823 ns/v 39 | Testing size 569319: best: 36.793 avg: 37.575 ns/v 40 | Testing size 740114: best: 39.087 avg: 39.346 ns/v 41 | Testing size 962148: best: 40.833 avg: 40.974 ns/v 42 | Testing size 1250792: best: 43.163 avg: 43.215 ns/v 43 | Testing size 1626029: best: 44.004 avg: 44.013 ns/v 44 | Testing size 2113837: best: 43.169 avg: 43.311 ns/v 45 | Testing size 2747988: best: 44.284 avg: 44.309 ns/v 46 | Testing size 3572384: best: 46.227 avg: 46.227 ns/v 47 | Testing size 4644099: best: 47.130 avg: 47.130 ns/v 48 | Testing size 6037328: best: 49.003 avg: 49.003 ns/v 49 | Testing size 7848526: best: 49.963 avg: 49.963 ns/v 50 | Testing size 10000000: best: 48.812 avg: 48.812 ns/v 51 | -------------------------------------------------------------------------------- /res/r_ska_.txt: -------------------------------------------------------------------------------- 1 | Sorting random 4-byte integers: ska_sort_copy 2 | Testing size 34: best: 17.206 avg: 24.087 ns/v 3 | Testing size 45: best: 17.644 avg: 24.946 ns/v 4 | Testing size 59: best: 20.254 avg: 26.434 ns/v 5 | Testing size 77: best: 20.325 avg: 27.922 ns/v 6 | Testing size 100: best: 23.390 avg: 29.486 ns/v 7 | Testing size 130: best: 9.523 avg: 9.751 ns/v 8 | Testing size 169: best: 8.574 avg: 8.807 ns/v 9 | Testing size 219: best: 7.854 avg: 8.058 ns/v 10 | Testing size 284: best: 7.359 avg: 7.560 ns/v 11 | 
Testing size 369: best: 6.930 avg: 7.145 ns/v 12 | Testing size 479: best: 6.597 avg: 6.846 ns/v 13 | Testing size 622: best: 6.338 avg: 6.548 ns/v 14 | Testing size 808: best: 6.147 avg: 6.363 ns/v 15 | Testing size 1050: best: 5.972 avg: 6.192 ns/v 16 | Testing size 1365: best: 5.862 avg: 6.066 ns/v 17 | Testing size 1774: best: 5.790 avg: 5.999 ns/v 18 | Testing size 2306: best: 5.788 avg: 5.975 ns/v 19 | Testing size 2997: best: 5.954 avg: 6.171 ns/v 20 | Testing size 3896: best: 6.344 avg: 6.755 ns/v 21 | Testing size 5064: best: 7.026 avg: 7.373 ns/v 22 | Testing size 6583: best: 7.728 avg: 8.129 ns/v 23 | Testing size 8557: best: 8.136 avg: 8.450 ns/v 24 | Testing size 11124: best: 8.354 avg: 8.645 ns/v 25 | Testing size 14461: best: 8.267 avg: 8.495 ns/v 26 | Testing size 18799: best: 8.468 avg: 8.718 ns/v 27 | Testing size 24438: best: 8.415 avg: 8.608 ns/v 28 | Testing size 31769: best: 8.490 avg: 8.717 ns/v 29 | Testing size 41299: best: 8.478 avg: 8.693 ns/v 30 | Testing size 53688: best: 8.840 avg: 9.025 ns/v 31 | Testing size 69794: best: 8.684 avg: 8.896 ns/v 32 | Testing size 90732: best: 8.943 avg: 9.176 ns/v 33 | Testing size 117951: best: 9.098 avg: 9.256 ns/v 34 | Testing size 153336: best: 9.240 avg: 9.619 ns/v 35 | Testing size 199336: best: 9.344 avg: 9.726 ns/v 36 | Testing size 259136: best: 9.973 avg: 10.349 ns/v 37 | Testing size 336876: best: 9.614 avg: 9.956 ns/v 38 | Testing size 437938: best: 10.083 avg: 10.554 ns/v 39 | Testing size 569319: best: 10.087 avg: 10.637 ns/v 40 | Testing size 740114: best: 10.229 avg: 10.824 ns/v 41 | Testing size 962148: best: 10.476 avg: 11.114 ns/v 42 | Testing size 1250792: best: 10.577 avg: 11.440 ns/v 43 | Testing size 1626029: best: 11.083 avg: 11.868 ns/v 44 | Testing size 2113837: best: 11.210 avg: 12.032 ns/v 45 | Testing size 2747988: best: 11.160 avg: 12.022 ns/v 46 | Testing size 3572384: best: 12.902 avg: 12.902 ns/v 47 | Testing size 4644099: best: 12.945 avg: 12.945 ns/v 48 | Testing size 
6037328: best: 13.090 avg: 13.090 ns/v 49 | Testing size 7848526: best: 13.061 avg: 13.061 ns/v 50 | Testing size 10000000: best: 13.053 avg: 13.053 ns/v 51 | -------------------------------------------------------------------------------- /res/s_merge.txt: -------------------------------------------------------------------------------- 1 | Sorting small-range plus outlier: mergesort 2 | Testing size 34: best: 23.235 avg: 31.262 ns/v 3 | Testing size 45: best: 25.133 avg: 32.233 ns/v 4 | Testing size 59: best: 26.288 avg: 32.959 ns/v 5 | Testing size 77: best: 28.688 avg: 34.394 ns/v 6 | Testing size 100: best: 30.660 avg: 36.105 ns/v 7 | Testing size 130: best: 33.723 avg: 38.665 ns/v 8 | Testing size 169: best: 35.645 avg: 40.696 ns/v 9 | Testing size 219: best: 38.393 avg: 43.171 ns/v 10 | Testing size 284: best: 41.852 avg: 46.270 ns/v 11 | Testing size 369: best: 44.848 avg: 48.735 ns/v 12 | Testing size 479: best: 47.541 avg: 51.375 ns/v 13 | Testing size 622: best: 50.659 avg: 53.875 ns/v 14 | Testing size 808: best: 53.597 avg: 56.206 ns/v 15 | Testing size 1050: best: 57.669 avg: 59.706 ns/v 16 | Testing size 1365: best: 58.693 avg: 60.684 ns/v 17 | Testing size 1774: best: 61.284 avg: 63.286 ns/v 18 | Testing size 2306: best: 63.893 avg: 66.061 ns/v 19 | Testing size 2997: best: 65.908 avg: 67.834 ns/v 20 | Testing size 3896: best: 68.160 avg: 69.856 ns/v 21 | Testing size 5064: best: 69.762 avg: 71.831 ns/v 22 | Testing size 6583: best: 71.355 avg: 72.942 ns/v 23 | Testing size 8557: best: 73.817 avg: 75.363 ns/v 24 | Testing size 11124: best: 74.161 avg: 75.814 ns/v 25 | Testing size 14461: best: 75.180 avg: 76.422 ns/v 26 | Testing size 18799: best: 76.592 avg: 77.715 ns/v 27 | Testing size 24438: best: 76.960 avg: 78.098 ns/v 28 | Testing size 31769: best: 77.534 avg: 78.752 ns/v 29 | Testing size 41299: best: 78.939 avg: 80.070 ns/v 30 | Testing size 53688: best: 79.485 avg: 80.547 ns/v 31 | Testing size 69794: best: 80.960 avg: 81.894 ns/v 32 | 
Testing size 90732: best: 80.931 avg: 81.817 ns/v 33 | Testing size 117951: best: 81.347 avg: 82.302 ns/v 34 | Testing size 153336: best: 83.154 avg: 83.694 ns/v 35 | Testing size 199336: best: 83.605 avg: 84.310 ns/v 36 | Testing size 259136: best: 84.647 avg: 85.115 ns/v 37 | Testing size 336876: best: 86.580 avg: 87.102 ns/v 38 | Testing size 437938: best: 86.799 avg: 87.128 ns/v 39 | Testing size 569319: best: 89.159 avg: 89.747 ns/v 40 | Testing size 740114: best: 88.959 avg: 89.719 ns/v 41 | Testing size 962148: best: 90.549 avg: 91.260 ns/v 42 | Testing size 1250792: best: 94.507 avg: 95.048 ns/v 43 | Testing size 1626029: best: 95.677 avg: 96.087 ns/v 44 | Testing size 2113837: best: 98.661 avg: 99.071 ns/v 45 | Testing size 2747988: best: 96.722 avg: 97.141 ns/v 46 | Testing size 3572384: best: 98.763 avg: 98.763 ns/v 47 | Testing size 4644099: best:100.434 avg:100.434 ns/v 48 | Testing size 6037328: best: 99.634 avg: 99.634 ns/v 49 | Testing size 7848526: best:100.360 avg:100.360 ns/v 50 | Testing size 10000000: best:102.776 avg:102.776 ns/v 51 | -------------------------------------------------------------------------------- /images/gif.bqn: -------------------------------------------------------------------------------- 1 | #! /usr/bin/env bqn 2 | 3 | # Gif utils 4 | EncodeLZW ← { 5 | dmax ← 2⋆12 6 | k←𝕨 ⋄ d←↕0 ⋄ r←↕0 7 | Add ← {c𝕊w: 8 | i ← d ⊑∘⊐ wc ← (dmax×w)+c 9 | { i<≠d ? k+i ; 10 | { dmax>k+≠d ? d ∾↩ wc ; @ } 11 | r ∾↩ w 12 | c 13 | } 14 | } 15 | r ∾ Add´⌽𝕩 16 | } 17 | 18 | EncPixels ← { 19 | c ← 2⋆𝕨 20 | d ← (c+2) EncodeLZW 𝕩 21 | Pack ← {⥊⍉ >2|⌊∘÷⟜2⍟(↕𝕨) 𝕩} 22 | s ← c∾d∾c+1 23 | m ← 12⌊⌈2⋆⁼(c+1)+↕≠s 24 | 2⊸×⊸+˜˝ ⍉↑‿8⥊∾ (𝕨+↕∘≠)⊸(Pack¨) (m-𝕨) ⊔ s 25 | } 26 | SplitBlocks ← { 27 | 0 ∾˜ ∾ ≠⊸∾¨ (255⌊∘÷˜↕∘≠)⊸⊔ 𝕩 28 | } 29 | 30 | # GIF with colors 𝕨 and list of delay‿pos‿inds 𝕩 31 | EncodeGIF ← { colors 𝕊 frames: 32 | size ← ⌽≢2⊑⊑frames 33 | cc ← 2⋆1+↕8 34 | cf ← 1⌈cc ⊑∘⍋ 1-˜≠colors 35 | "Too many colors" ! 
cf<≠cc 36 | colors ↑˜↩ cf⊑cc 37 | 38 | Enc2 ← ∾256(|⋈⌊∘÷˜)¨⊢ 39 | Block ← {delay‿pos‿inds: ∾⟨ 40 | # Graphic control extension 41 | 33‿249, 4, 4, Enc2 delay, 255, 0 42 | # Image descriptor 43 | 44 44 | Enc2 pos∾○⌽≢inds 45 | 0 # No local colors 46 | 1+cf 47 | SplitBlocks (1+cf) EncPixels ⥊inds 48 | ⟩} 49 | 50 | ∾⟨ 51 | # Bit depth, background, color list 52 | "GIF89a"-@, Enc2 size, 128+17×cf, 0, 0, ⥊colors 53 | 54 | # For animation 55 | 33‿255, 11, "NETSCAPE2.0"-@, 3, 1, 255‿255, 0 56 | 57 | ∾ Block¨ frames 58 | 59 | 59 # End 60 | ⟩ 61 | } 62 | 63 | # Draw the algorithm 64 | col ← >⟨ # Colors 65 | 0‿0‿0, # 0 Black 66 | 16‿3‿230, 8‿7‿58, # 1 2 Blue 67 | 45‿166‿36, 12‿48‿10, # 3 4 Green 68 | 137‿37‿14 # 5 Orange 69 | ⟩ 70 | w‿n‿h‿p ← 260‿90‿60‿10 ⋄ e←3 ⋄ outl←5 # Parameters 71 | hh ← p+2×h 72 | Dist ← ⌊ (h÷14) × ·((1.5×·•math.Sin (3.5×π)⊸×) + 15×ט) ÷⟜h 73 | bar ← 3 + Dist n (•MakeRand 0).Range h 74 | Seq←⌽∘↕⊸(<⌜) ⋄ EE←(⋈˜i-j0?⟨⟩; 87 | j0↩{0<𝕩?∞≠(𝕩-1)⊑buf?𝕊𝕩-1;𝕩}j0 ⋄ c←i-j0 88 | thr↩blk 89 | c⌊⌾(÷⟜blk)↩ 90 | st←≠steal 91 | buf (∞¨⊣{steal∾↩𝕩})⌾((j0+↕c)⊸⊏)↩ 92 | ⟨2, j0⋈hh-m, (e×m‿c)⥊0 ⟩‿⟨20, st⋈0, EE 5×h Seq (-c)↑steal⟩ 93 | } 94 | } 95 | updates ← ∾ <∘Insert˘ ⍉>(↕≠bar)‿bp‿bar 96 | 97 | # Filter buffer 98 | Filter ← {d𝕊f‿t‿b: s←e×b‿1 99 | ⟨2, f⋈hh-b, s⥊4⟩‿⟨d, t⋈0, (-e×h)↑s⥊3⟩ 100 | } 101 | bi ← /∞≠buf 102 | updates ∾↩ ⥊ (1+(≠buf)(«-⊢)bi) Filter˘ ⍉>⟨bi,bar(-+↕∘⊢)○≠bi,bi⊏buf⟩ 103 | 104 | # Merge 105 | arr ← steal∾bi⊏buf 106 | Merge ← {c𝕊i‿l‿n: 107 | sl←@ ⋄ arr ∧∘{sl↩𝕩}⌾(n↑i↓⊢)↩ 108 | cp ← ⟨20, 0⋈h+p, EE 5×h Seq l↑sl⟩ 109 | r ← (l⊸≤⊸(i⊸×⊸+⋈¨(h+p)׬∘⊣) ↕n) ⋈¨ {EE(-h)↑𝕩‿1⥊4}¨ sl 110 | (⋈cp) ∾ ⥊ (sl⍋⊸⊏2∾¨r) ≍˘ (i+↕n) {⟨4, 𝕨‿0, EE(-h)↑𝕩‿1⥊c⟩}¨ ∧sl 111 | } 112 | Ms ← ∾ (↕∘⌈⌾(2⋆⁼⊢))⊸({d←2×𝕨⋄(⊢∾𝕨‿d⌊𝕩⊸-)¨↕∘⌈⌾(÷⟜d)𝕩}¨) 113 | updates ∾↩ ∾ ((3«5¨)Merge¨⊢) (Ms⌾(÷⟜blk) ≠steal) ∾ ⋈0∾≠¨steal‿arr 114 | 115 | updates 80⌾⊑⌾(¯1⊸⊑)↩ 116 | updates (outl+e×⌽)⌾(1⊸⊑)¨↩ 117 | 118 | "robinhood.gif" •file.Bytes col EncodeGIF ⟨30,0‿0,ind0⟩ <⊸∾ updates 119 | 
-------------------------------------------------------------------------------- /images/line.bqn: -------------------------------------------------------------------------------- 1 | #! /usr/bin/env bqn 2 | 3 | util ← •Import "util.bqn" 4 | Num‿Pos‿SVG‿Ge‿Text‿TSize‿Path‿Background‿Outline‿Legend‿Tick ← util 5 | 6 | files‿opts ← 2 ↑ ('-'=⊑¨)⊸⊔ •args ⋄ opts 1⊸↓¨↩ 7 | GetOpt ← { l←≠𝕩 ⋄ (l+1)↓¨(𝕩≡l↑⊢)¨⊸/opts } 8 | input ← •wdpath⊸•file.At¨ files 9 | n ← ≠input 10 | 11 | titles ← "Sorting "⊸∾¨ ⟨ 12 | "random 4-byte integers" 13 | "small-range plus outlier" 14 | "10,000 small-range 4-byte integers" 15 | ⟩ 16 | 17 | ToNats ← ((>⟜«0⊸≤) / 0(0⊸≤××⟜10⊸+)`⊢) · 10⊸≤⊸(¬⊸×-⊣) -⟜'0' 18 | ParseHead ← { 19 | title‿name ⇐ (+`':'⊸=)⊸⊔ 𝕩 20 | name ↩ (-4+5×"ska_sort_copy"⊸≡)⊸↓2↓name 21 | } 22 | ParsePath ← { # Old style with no header and info in the filename 23 | name ⇐ (»∨`)∘=⟜'_'⊸/ fn ← ∧`∘≠⟜'.'⊸/ •file.Name 𝕩 24 | title ⇐ (⊑"sc"⊐⊑fn) ⊑ 1⌽titles 25 | } 26 | Read ← { 27 | l ← •file.Lines f←𝕩 28 | info ← {'S'≡⊑𝕩?l↓˜↩1⋄ParseHead𝕩; ParsePath f} ⊑l 29 | pr ← "profile" ∨´∘⍷⊑l 30 | c ← { ¬pr ? 3 ; m←⌈´n←1+´¨'.'=l ⋄ l∾¨↩⥊∘/⟜(≍" 0")¨m-n ⋄ m } 31 | tab ← ∘‿c ⥊ ToNats '.'⊸≠⊸/ ∾l 32 | info ⋈ pr◶⟨0‿2⊸⊏, ⊏∾·+`3⊸↓⟩ ⍉ ÷⟜1e3⌾(1↓⍉) 4 ↓ tab 33 | } 34 | 35 | info‿dat ← <˘⍉> Read¨ input 36 | x ← ⊏⊑dat 37 | ! ∧´(x≡⊏)¨1↓dat 38 | 39 | ! 
1≥+´∊ 0⊸≠⊸/ tind ← titles ⊐ {𝕩.title}¨ info 40 | type ← ⌈´ tind 41 | title ← ((⊑tind⊐type) ⊑ info).title 42 | 43 | xr ← 2=type # If x indicates range instead of length 44 | hasparts ← ∧´1=yn←1-˜≠¨dat # If there's any profiling information 45 | ylog ← hasparts # Don't use log-scale y to show parts 46 | {ylog↩"0"≢𝕩}¨ GetOpt "ylog" 47 | 48 | x (2⋆13)⊸÷⍟xr↩ 49 | 50 | names‿colors‿rhf ← util.ProcName {𝕩.name}¨ info 51 | col ← "stroke"⊸⋈¨ colors 52 | 53 | width ← 512 54 | off ← 50‿40 ⋄ end ← 20‿46 55 | 56 | gr ← "stroke-width=1.2|font-size=14px|text-anchor=middle" 57 | TT ← < TSize⊸∾⟜(Pos(width÷2)⊸⋈)´⊸Text 58 | 59 | Log ← ⋆⁼⌾(ylog⊑⊑‿⊢) 60 | win ← -˜`¨ bounds ← 0⌾(⊑1⊑⊢)⍟(¬ylog) (⌊´≍⌈´)∘⥊¨ Log xy ← ⟨x, ∾1↓¨dat⟩ 61 | dim ← width (⊣⋈×) 0.55 62 | out ← (-off)≍dim+off+end 63 | padb ← ylog × padi ← 0‿20 64 | Scale ← padi+(dim-padb+padi)× ({¬𝕏}⌾(1⊸⊑) {𝕩÷˜𝕨-˜⊢}´¨ win) {𝕎𝕩}¨ Log 65 | line ← (<≍˘)⎉1○Num´ Scale xy 66 | lstyle ← (((↓≍"stroke-width"‿"1.4")⊏˜∊⌾⌽) ∾¨ ⊏⟜col) /yn 67 | 68 | •Out¨ (⥊out) SVG gr Ge ⟨ 69 | Background out 70 | ⟨18,¯19⟩ TT title 71 | ⟨ 72 | (xr⊑"Size"‿"Density (length / range)") Text˜ Pos 0‿26+dim×0.5‿1 73 | "Time (ns / value)" Text˜ "transform"‿"rotate(-90)"∾Pos 0‿32-˜⌽dim×0‿¯0.5 74 | ⟩ 75 | Tick { 76 | off⇐0‿0 ⋄ dim⇐dim 77 | orient⇐"vh" 78 | RoundDown ← {(⊑∘⍋⟜𝕩-1˙)⊸⊑1‿2‿5×⌊⌾(10⋆⁼⊢)𝕩} 79 | tpos ⇐ Scale ticks ← ⟨ 80 | {xr ? 2⋆10-↕20 ; 10⋆2+↕6} 81 | {ylog ? 
⥊1‿10×⌜∧1.5‿7∾1+↕5 ; (RoundDown 6÷˜1⊑1⊑bounds)×1+↕15} 82 | ⟩ 83 | ttext ⇐ ⟨1e3⊸<◶Num‿("1e"∾'0'+·⌊0.5+10⋆⁼⊢)¨, Num⟩ {𝕎𝕩}¨ ticks 84 | } 85 | { hasparts?⟨⟩; 86 | Area ← ∾∘⥊ ("M "∾"HV"»"L "⎉1) ∾¨ (Num dim-padb)⊸∾ 87 | st ← "opacity"‿"0.2"⊸≍¨(yn-1)/"fill"⊸⋈¨colors 88 | "stroke=none" Ge st Path⟜Area¨ (¬∊/yn)/line 89 | }∾⟨ 90 | "stroke-width=2.6|fill=none" Ge lstyle Path⟜('M'⌾⊑∘∾·⥊ "L "∾¨⎉1⊢)¨ line 91 | ⟩ 92 | Outline 0‿0≍dim 93 | Legend { 94 | label ⇐ names ⋄ isRH ⇐ rhf 95 | place ⇐ 10‿(xr⊑6‿208) 96 | width ⇐ 134 ⋄ spacing ⇐ 18 ⋄ pad ⇐ 6 97 | Samples ⇐ "stroke-width=2.6" Ge col ≍⊸Path⟜(∾"M h"∾¨Num)¨ (8∾∾⟜22)¨ 98 | } 99 | ⟩ 100 | -------------------------------------------------------------------------------- /res/rp_rh.txt: -------------------------------------------------------------------------------- 1 | Sorting random 4-byte integers: rhsort 2 | Testing size 34: best: 12.676 avg: 15.802 ns/v profile 1.223 2.751 3.925 3.725 3 | Testing size 45: best: 9.978 avg: 13.188 ns/v profile 0.998 2.097 4.056 2.864 4 | Testing size 59: best: 9.797 avg: 11.720 ns/v profile 0.868 2.099 2.907 3.450 5 | Testing size 77: best: 8.221 avg: 9.899 ns/v profile 0.719 1.643 3.048 2.640 6 | Testing size 100: best: 7.000 avg: 8.901 ns/v profile 0.672 1.362 2.866 2.588 7 | Testing size 130: best: 6.892 avg: 7.919 ns/v profile 0.597 1.201 2.351 2.668 8 | Testing size 169: best: 5.811 avg: 6.759 ns/v profile 0.537 0.914 2.402 2.073 9 | Testing size 219: best: 6.292 avg: 6.925 ns/v profile 0.502 0.910 1.865 2.991 10 | Testing size 284: best: 5.331 avg: 5.950 ns/v profile 0.473 0.722 1.923 2.340 11 | Testing size 369: best: 4.650 avg: 5.257 ns/v profile 0.451 0.543 2.098 1.787 12 | Testing size 479: best: 5.223 avg: 5.651 ns/v profile 0.436 0.604 1.676 2.637 13 | Testing size 622: best: 4.490 avg: 4.943 ns/v profile 0.427 0.489 1.763 2.033 14 | Testing size 808: best: 4.019 avg: 4.444 ns/v profile 0.408 0.365 1.929 1.568 15 | Testing size 1050: best: 4.650 avg: 4.960 ns/v profile 0.411 0.455 
1.620 2.333 16 | Testing size 1365: best: 4.136 avg: 4.427 ns/v profile 0.410 0.366 1.736 1.806 17 | Testing size 1774: best: 5.106 avg: 5.343 ns/v profile 0.401 0.598 1.501 2.762 18 | Testing size 2306: best: 4.435 avg: 4.631 ns/v profile 0.394 0.454 1.601 2.120 19 | Testing size 2997: best: 4.032 avg: 4.241 ns/v profile 0.395 0.341 1.819 1.632 20 | Testing size 3896: best: 4.925 avg: 5.145 ns/v profile 0.388 0.521 1.695 2.504 21 | Testing size 5064: best: 4.537 avg: 4.708 ns/v profile 0.386 0.402 1.968 1.923 22 | Testing size 6583: best: 5.618 avg: 5.825 ns/v profile 0.386 0.628 1.800 2.989 23 | Testing size 8557: best: 5.205 avg: 5.363 ns/v profile 0.382 0.463 2.221 2.279 24 | Testing size 11124: best: 5.536 avg: 5.730 ns/v profile 0.381 0.353 3.221 1.761 25 | Testing size 14461: best: 6.231 avg: 6.420 ns/v profile 0.394 0.580 2.705 2.730 26 | Testing size 18799: best: 6.397 avg: 6.583 ns/v profile 0.388 0.439 3.656 2.091 27 | Testing size 24438: best: 7.379 avg: 7.580 ns/v profile 0.388 0.343 5.231 1.611 28 | Testing size 31769: best: 7.207 avg: 7.470 ns/v profile 0.390 0.643 3.943 2.488 29 | Testing size 41299: best: 7.625 avg: 7.813 ns/v profile 0.386 0.414 5.082 1.926 30 | Testing size 53688: best: 7.669 avg: 8.099 ns/v profile 0.393 0.861 3.903 2.939 31 | Testing size 69794: best: 7.594 avg: 7.754 ns/v profile 0.388 0.503 4.625 2.235 32 | Testing size 90732: best: 8.009 avg: 8.232 ns/v profile 0.391 0.396 5.682 1.760 33 | Testing size 117951: best: 8.824 avg: 9.917 ns/v profile 0.409 1.255 5.403 2.848 34 | Testing size 153336: best: 8.860 avg: 9.167 ns/v profile 0.508 0.590 5.938 2.128 35 | Testing size 199336: best: 9.884 avg: 10.178 ns/v profile 0.522 0.506 7.454 1.694 36 | Testing size 259136: best: 14.119 avg: 15.761 ns/v profile 0.534 2.185 9.864 3.176 37 | Testing size 336876: best: 15.756 avg: 16.095 ns/v profile 0.622 1.133 11.570 2.768 38 | Testing size 437938: best: 23.386 avg: 25.317 ns/v profile 0.709 4.316 15.254 5.036 39 | Testing size 569319: 
best: 23.283 avg: 23.387 ns/v profile 0.718 2.033 16.569 4.064 40 | Testing size 740114: best: 23.855 avg: 24.149 ns/v profile 0.691 1.600 18.717 3.140 41 | Testing size 962148: best: 28.410 avg: 31.255 ns/v profile 0.723 5.794 19.649 5.088 42 | Testing size 1250792: best: 27.966 avg: 28.086 ns/v profile 0.736 2.173 21.254 3.923 43 | Testing size 1626029: best: 30.032 avg: 30.092 ns/v profile 0.703 1.691 24.815 2.883 44 | Testing size 2113837: best: 37.846 avg: 38.499 ns/v profile 0.702 9.814 23.258 4.725 45 | Testing size 2747988: best: 37.102 avg: 37.493 ns/v profile 0.669 7.296 25.984 3.543 46 | Testing size 3572384: best: 41.773 avg: 41.773 ns/v profile 0.578 9.875 25.680 5.640 47 | Testing size 4644099: best: 40.169 avg: 40.169 ns/v profile 0.533 7.464 28.043 4.128 48 | Testing size 6037328: best: 40.208 avg: 40.208 ns/v profile 0.508 5.637 31.098 2.964 49 | Testing size 7848526: best: 47.442 avg: 47.442 ns/v profile 0.519 8.345 34.085 4.493 50 | Testing size 10000000: best: 45.791 avg: 45.791 ns/v profile 0.507 6.505 35.449 3.330 51 | -------------------------------------------------------------------------------- /images/range.svg: -------------------------------------------------------------------------------- 1 | [SVG line chart; markup not preserved. Title: Sorting 10,000 small-range 4-byte integers. X axis: Density (length / range), ticks from 64 down to 0.062. Y axis: Time (ns / value), ticks from 1.5 to 30. Legend: quadsort, fluxsort, Robin Hood.] -------------------------------------------------------------------------------- /res/sp_rh.txt: -------------------------------------------------------------------------------- 1 | Sorting small-range plus outlier: 
rhsort 2 | Testing size 34: best: 34.206 avg: 39.403 ns/v profile 1.221 2.528 24.307 3.625 2.854 3 | Testing size 45: best: 31.356 avg: 36.714 ns/v profile 0.996 2.040 21.283 4.237 4.500 4 | Testing size 59: best: 29.627 avg: 35.244 ns/v profile 0.894 1.568 19.497 3.246 7.246 5 | Testing size 77: best: 31.961 avg: 38.138 ns/v profile 0.752 1.799 18.738 4.305 10.382 6 | Testing size 100: best: 31.600 avg: 37.501 ns/v profile 0.688 1.385 17.749 3.320 12.693 7 | Testing size 130: best: 32.162 avg: 37.746 ns/v profile 0.608 1.071 16.930 2.574 15.276 8 | Testing size 169: best: 33.030 avg: 38.551 ns/v profile 0.563 1.041 15.855 3.619 16.470 9 | Testing size 219: best: 33.142 avg: 38.908 ns/v profile 0.524 0.828 15.461 2.813 18.514 10 | Testing size 284: best: 35.271 avg: 40.187 ns/v profile 0.495 0.670 15.052 2.180 21.204 11 | Testing size 369: best: 38.509 avg: 42.966 ns/v profile 0.471 0.762 14.893 3.208 23.168 12 | Testing size 479: best: 40.152 avg: 44.573 ns/v profile 0.447 0.593 14.643 2.481 26.039 13 | Testing size 622: best: 44.762 avg: 48.887 ns/v profile 0.435 0.672 14.643 3.719 29.115 14 | Testing size 808: best: 46.700 avg: 50.787 ns/v profile 0.426 0.516 14.648 2.866 32.100 15 | Testing size 1050: best: 50.026 avg: 54.342 ns/v profile 0.407 0.408 15.111 2.206 36.020 16 | Testing size 1365: best: 53.384 avg: 57.000 ns/v profile 0.397 0.612 15.038 3.353 37.455 17 | Testing size 1774: best: 54.948 avg: 59.252 ns/v profile 0.394 0.500 15.062 2.575 40.608 18 | Testing size 2306: best: 57.487 avg: 61.286 ns/v profile 0.389 0.397 15.124 1.981 43.308 19 | Testing size 2997: best: 60.257 avg: 64.181 ns/v profile 0.390 0.532 15.048 3.015 45.120 20 | Testing size 3896: best: 62.910 avg: 65.783 ns/v profile 0.395 0.400 15.175 2.345 47.417 21 | Testing size 5064: best: 64.696 avg: 68.946 ns/v profile 0.383 0.599 15.060 3.555 49.312 22 | Testing size 6583: best: 65.275 avg: 69.178 ns/v profile 0.392 0.450 14.868 2.749 50.689 23 | Testing size 8557: best: 67.214 avg: 
70.615 ns/v profile 0.382 0.344 14.901 2.105 52.861 24 | Testing size 11124: best: 69.130 avg: 72.550 ns/v profile 0.381 0.607 14.685 3.232 53.628 25 | Testing size 14461: best: 68.904 avg: 72.159 ns/v profile 0.386 0.422 14.543 2.483 54.311 26 | Testing size 18799: best: 69.875 avg: 72.295 ns/v profile 0.381 0.324 14.323 1.901 55.355 27 | Testing size 24438: best: 70.919 avg: 73.928 ns/v profile 0.387 0.615 14.050 2.930 55.937 28 | Testing size 31769: best: 71.505 avg: 74.148 ns/v profile 0.389 0.404 14.342 2.251 56.755 29 | Testing size 41299: best: 74.452 avg: 76.542 ns/v profile 0.386 0.807 14.084 3.452 57.804 30 | Testing size 53688: best: 73.394 avg: 75.676 ns/v profile 0.386 0.503 13.984 2.631 58.167 31 | Testing size 69794: best: 74.533 avg: 76.649 ns/v profile 0.389 0.400 14.072 2.049 59.733 32 | Testing size 90732: best: 76.409 avg: 78.538 ns/v profile 0.390 1.162 14.101 3.237 59.642 33 | Testing size 117951: best: 75.963 avg: 78.163 ns/v profile 0.391 0.711 14.297 2.500 60.261 34 | Testing size 153336: best: 77.863 avg: 79.863 ns/v profile 0.626 0.587 14.544 1.969 62.133 35 | Testing size 199336: best: 81.414 avg: 83.930 ns/v profile 0.683 2.593 14.728 3.141 62.782 36 | Testing size 259136: best: 81.221 avg: 82.074 ns/v profile 0.637 1.602 14.897 2.418 62.519 37 | Testing size 336876: best: 88.226 avg: 89.444 ns/v profile 0.639 3.766 16.589 3.748 64.700 38 | Testing size 437938: best: 86.277 avg: 86.929 ns/v profile 0.507 2.098 16.484 2.876 64.963 39 | Testing size 569319: best: 88.117 avg: 88.314 ns/v profile 0.509 1.718 16.579 2.210 67.298 40 | Testing size 740114: best: 93.214 avg: 95.134 ns/v profile 0.489 4.289 19.590 3.542 67.223 41 | Testing size 962148: best: 93.172 avg: 93.211 ns/v profile 0.495 2.156 19.520 2.764 68.275 42 | Testing size 1250792: best: 94.877 avg: 94.991 ns/v profile 0.504 1.710 19.497 2.117 71.162 43 | Testing size 1626029: best:100.211 avg:101.180 ns/v profile 0.515 6.051 19.138 3.189 72.286 44 | Testing size 2113837: best: 
99.281 avg: 99.330 ns/v profile 0.503 2.087 19.194 2.465 75.081 45 | Testing size 2747988: best:103.928 avg:104.185 ns/v profile 0.508 9.083 18.793 3.747 72.053 46 | Testing size 3572384: best:101.511 avg:101.511 ns/v profile 0.509 6.979 18.853 2.882 72.288 47 | Testing size 4644099: best:101.004 avg:101.004 ns/v profile 0.503 5.346 18.725 2.187 74.242 48 | Testing size 6037328: best:100.600 avg:100.600 ns/v profile 0.498 8.069 18.717 3.390 69.925 49 | Testing size 7848526: best: 98.732 avg: 98.732 ns/v profile 0.509 6.194 18.879 2.612 70.538 50 | Testing size 10000000: best: 97.901 avg: 97.901 ns/v profile 0.500 4.849 18.729 2.045 71.777 51 | -------------------------------------------------------------------------------- /images/rand.svg: -------------------------------------------------------------------------------- 1 | [SVG line chart; markup not preserved. Title: Sorting random 4-byte integers. X axis: Size, ticks from 100 to 1e7. Y axis: Time (ns / value), ticks from 4 to 50. Legend: fluxsort, wolfsort, ska_sort_copy, Robin Hood.] -------------------------------------------------------------------------------- /bench.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | 6 | // Options for sorting algorithms: 7 | #if QUADSORT 8 | #define sortname "quadsort" 9 | #elif FLUXSORT 10 | #define sortname "fluxsort" 11 | #elif WOLFSORT 12 | #define sortname "wolfsort" 13 | #elif SKASORT 14 | #define sortname "ska_sort" 15 | #elif SKACOPY 16 | #define sortname "ska_sort_copy" 17 | #elif PDQSORT 18 | #define sortname "pdqsort" 19 | #elif MERGESORT 20 | #define sortname "mergesort" 21 | #elif RHMERGESORT 22 | #define sortname "rhmergesort" 23 | #elif GLIDESORT 
24 | #define sortname "glidesort" 25 | #else 26 | #define sortname "rhsort" 27 | #endif 28 | 29 | #if QUADMERGE 30 | #define quadness "quad+" 31 | #else 32 | #define quadness "" 33 | #endif 34 | 35 | #if BRAVE 36 | #define bravery "brave+" 37 | #else 38 | #define bravery "" 39 | #endif 40 | 41 | // Options for test to perform: 42 | #if RANGES // Small range 43 | #define datadesc "10,000 small-range 4-byte integers" 44 | #elif WORST // RH worst case 45 | #define datadesc "small-range plus outlier" 46 | #else // Random 47 | #define datadesc "random 4-byte integers" 48 | #endif 49 | 50 | #if WORST 51 | #define MODIFY(arr) arr[0] = 3<<28 52 | #else 53 | #define MODIFY(arr) (void)0 54 | #endif 55 | 56 | typedef size_t U; 57 | static U monoclock(void) { 58 | struct timespec ts; 59 | clock_gettime(CLOCK_MONOTONIC, &ts); 60 | return 1000000000*ts.tv_sec + ts.tv_nsec; 61 | } 62 | 63 | #if PROFILE 64 | #define PROFLEN 10 65 | static U profsums[PROFLEN]; 66 | #define PROF_INIT memset(profsums, 0, sizeof(profsums)) 67 | #define PROF_START(n) U proft##n = monoclock() 68 | #define PROF_CONT(n) proft##n = monoclock() 69 | #define PROF_END(n) profsums[n] += monoclock()-proft##n 70 | static void printprof(U denom) { 71 | U l=PROFLEN; while (l && profsums[l-1]==0) l--; 72 | printf(" profile"); 73 | for (U i=0; i *(b)) 84 | #include "wolfsort/src/quadsort.h" 85 | #endif 86 | 87 | #if FLUXSORT || WOLFSORT 88 | #include "wolfsort/src/fluxsort.h" 89 | #endif 90 | 91 | #if WOLFSORT 92 | #include "wolfsort/src/wolfsort.h" 93 | #elif PDQSORT 94 | #include "wolfsort/src/pdqsort.h" 95 | #elif SKASORT || SKACOPY 96 | #include "wolfsort/src/ska_sort.hpp" 97 | #endif 98 | 99 | #if GLIDESORT 100 | extern void glidesort(T *x, U n); 101 | #endif 102 | 103 | static void sort32(T *x, U n) { 104 | #if QUADSORT 105 | quadsort32(x, n, NULL); 106 | #elif FLUXSORT 107 | fluxsort32(x, n, NULL); 108 | #elif WOLFSORT 109 | wolfsort(x, n, 4, NULL); 110 | #elif SKASORT 111 | ska_sort(x, x+n); 112 | #elif 
SKACOPY 113 | T *aux = malloc(n*sizeof(T)); 114 | ska_sort_copy(x, x+n, aux); 115 | free(aux); 116 | #elif PDQSORT 117 | pdqsort(x, x+n); 118 | #elif MERGESORT 119 | T *aux = malloc(n*sizeof(T)); 120 | mergefrom(x, n, 1, aux); 121 | free(aux); 122 | #elif RHMERGESORT 123 | rhmergesort32(x, n); 124 | #elif GLIDESORT 125 | glidesort(x, n); 126 | #else 127 | rhsort32(x, n); 128 | #endif 129 | } 130 | 131 | // For qsort 132 | int cmpi(const void * ap, const void * bp) { 133 | T a=*(T*)ap, b=*(T*)bp; 134 | return (a>b) - (a1) { 149 | ls = argv[1][0]=='l'; 150 | if (ls) { 151 | // Log line chart 100 to 1e7 with 44 points, plus 4 before for warmup 152 | min=0; 153 | max = (argc>2) ? atoi(argv[2]) : 48; 154 | } else { 155 | max=atoi(argv[argc-1]); 156 | if (argc>2) min=atoi(argv[argc-2]); 157 | } 158 | } 159 | 160 | U sizes[max+1]; 161 | if (!ls) { for (U k=0,n=1 ; k<=max; k++,n*=10) sizes[k]=n; } 162 | else { for (U k=0,n=34; k<=max; k++,n=n*1.3+(n<70)) sizes[k]=n; if(max==48)sizes[max]=10000000; } 163 | 164 | #ifndef RANGES 165 | U s=sizes[max]; s+=n_iter(s)-1; 166 | U q=sizes[min]; q+=n_iter(q)-1; if (s 4 | #include 5 | #include 6 | #include 7 | #include 8 | 9 | #include "rhsort.c" 10 | 11 | int cmpi(const void * ap, const void * bp) { 12 | T a=*(T*)ap, b=*(T*)bp; 13 | return (a>b) - (a>64); } 24 | static uint64_t _wymix(uint64_t A, uint64_t B){ _wymum(&A,&B); return A^B; } 25 | static uint64_t wyrand(uint64_t *seed){ *seed+=0xa0761d6478bd642full; return _wymix(*seed,*seed^0xe7037ed1a0b428dbull);} 26 | static uint64_t wy2u0k(uint64_t r, uint64_t k){ _wymum(&r,&k); return k; } 27 | #define RAND(range) wy2u0k(wyrand(&seed), range) 28 | 29 | // Floyd to get sorted sample of k indices < n 30 | // O(n^2) because it insertion sorts 31 | void sample(U *out, U k, U n, uint64_t *seedp) { 32 | uint64_t seed = *seedp; 33 | for (U i=0; ij; t--) out[t]=out[t-1]; 41 | out[j]=r; 42 | } 43 | } 44 | *seedp = seed; 45 | } 46 | 47 | // Maximum segments for piecewise-linear distribution 48 
| #define MAX_SEGS 32 49 | 50 | // Generate an input with a random piecewise-linear distribution 51 | void genseq(uint64_t seed, T *array, U len) { 52 | T min = (T)wyrand(&seed), max = (T)wyrand(&seed); 53 | if (maxmax) max=e; 112 | } 113 | U r = (U)(UT)(max-min) + 1; // Size of range 114 | 115 | // Counting sort handled separately 116 | if (r/4 < n) return 0; 117 | 118 | U sh = 0; // Contract to fit range 119 | while (r>5*n) { sh++; r>>=1; } // Shrink to stay at O(n) memory 120 | 121 | U score = 0; 122 | T s[c]; // Sample 123 | for (U i=0; i> sh) 126 | for (U i=1, prev=POS(s[i-1]); ii) break; d=next-POS(s[i-o]); } 129 | } 130 | #undef POS 131 | return score; 132 | } 133 | 134 | #ifndef LENGTH 135 | #define LENGTH 10000 136 | #endif 137 | 138 | int main(int argc, char **argv) { 139 | U n = LENGTH, iter = 1+2000000/(20+n), checks = 1+10*100; 140 | #ifdef SAMPLE 141 | U cand = SAMPLE; 142 | #else 143 | U cand=1; // floor(sqrt(n)) by binary search 144 | while (4*cand*cand<=n) cand*=2; 145 | for (U c=cand;c;c/=2) if ((c+cand)*(c+cand)<=n) cand+=c; 146 | #endif 147 | U rounds = argc>1 ? 
atoi(argv[1]) : 150; 148 | printf("Checking %ld candidates out of %ld values\n", cand, n); 149 | U s = n*sizeof(T), si=(n+iter-1)*sizeof(T); 150 | T *data = malloc(si), // Saved random data 151 | *sort = malloc(si), // Array to be sorted 152 | *chk = malloc(si); // For checking with qsort 153 | T *crs = malloc(checks*sizeof(T)); // Criterion sample results 154 | for (U k=0; k [SVG line chart; markup not preserved. Title: Sorting random 4-byte integers. X axis: Size, ticks from 100 to 1e7. Y axis: Time (ns / value), ticks from 4 to 50. Legend: fluxsort, wolfsort, ska_sort_copy, glidesort, Robin Hood.] -------------------------------------------------------------------------------- /rhsort.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | typedef int T; 5 | typedef unsigned int UT; 6 | typedef size_t U; 7 | #define LIKELY(X) __builtin_expect(X,1) 8 | #define RARE(X) __builtin_expect(X,0) 9 | 10 | #ifndef PROF_START 11 | #define PROF_START(n) (void)0 12 | #define PROF_CONT(n) (void)0 13 | #define PROF_END(n) (void)0 14 | #endif 15 | 16 | // Minimum size to steal from buffer 17 | static const U BLOCK = 16; 18 | 19 | #if QUADMERGE 20 | #define cmp(a,b) (*(a) > *(b)) 21 | #include "quadsort_mod.h" // Call wolfbench.sh 22 | #endif 23 | 24 | // Merge arrays of length l and n-l starting at a, using buffer aux. 
25 | static void merge(T *a, U l, U n, T *aux) { 26 | #if QUADMERGE 27 | partial_backward_merge32(a, aux, n, l, NULL); 28 | #else 29 | // Easy cases when the merge can be avoided 30 | // If the buffer is helping at all, most merges go through these 31 | if (a[l-1] <= a[l]) return; 32 | if (a[n-1] < a[0] && l+l==n) { 33 | T *b = a+l; 34 | for (U i=0; i=n || aux[ai]<=a[bi]) 41 | a[i] = aux[ai++]; 42 | else 43 | a[i] = a[bi++]; 44 | } 45 | #endif 46 | } 47 | 48 | // Merge array x of size n, if units of length block are pre-sorted 49 | static void mergefrom(T *x, U n, U block, T *aux) { 50 | #if QUADMERGE 51 | quad_merge32(x, aux, n, n, block, NULL); 52 | #else 53 | for (U w=block; wmax) max=e; 93 | } 94 | U r = (U)(UT)(max-min) + 1; // Size of range 95 | PROF_END(0); 96 | if (RARE(r/4 < n)) { // Counting sort if it's small 97 | PROF_START(5); count(x, n, min, r); PROF_END(5); return; 98 | } 99 | 100 | // Planning for the buffer 101 | PROF_START(1); 102 | // Sentinel value: the buffer swallows these but count recovers them 103 | T s = max; 104 | U sh = 0; // Contract to fit range 105 | while (r>5*n) { sh++; r>>=1; } // Shrink to stay at O(n) memory 106 | // Goes down to BLOCK once we know we have to merge 107 | U threshold = 2*BLOCK; 108 | U sz = r + threshold; // Buffer size 109 | #if BRAVE 110 | sz = r + n; 111 | #endif 112 | 113 | // Allocate buffer, and fill with sentinels 114 | T *aux = malloc((sz>n?sz:n)*sizeof(T)); // >=n for merges later 115 | for (U i=0; i> sh) 121 | for (U i=0; i=h; // If we have to move past that entry 134 | j += c; // Increments until e's final location found 135 | aux[f-c] = h; // Reposition h 136 | h = n; 137 | } while (h!=s); // Until the end of the chain 138 | aux[j] = e; 139 | f += 1; // To account for just-inserted e 140 | 141 | #ifndef BRAVE 142 | // Bad collision: send chain back to x 143 | if (RARE(f-j0 >= threshold)) { 144 | threshold = BLOCK; 145 | // Find the beginning of the chain (required for stability) 146 | while (j0 && 
aux[j0-1]!=s) j0--; 147 | // Move as many blocks from it as possible 148 | T *hj = aux+j0, *hf = aux+f; 149 | while (hj <= hf-BLOCK) { 150 | for (U i=0; ipr ? pp : pr; 159 | aux[pr++] = e; 160 | } 161 | } 162 | #endif 163 | } 164 | #undef POS 165 | PROF_END(2); 166 | 167 | // Move all values from the buffer back to the array 168 | // Use xt += to convince the compiler to make it branchless 169 | PROF_START(3); 170 | while (aux[--sz] == s); sz++; 171 | T *xt=xb; 172 | { 173 | static const U u=8; // Unrolling size 174 | #define WR(I) xt += s!=(*xt=aux[i+I]) 175 | U i=0; 176 | for (; i<(sz&~(u-1)); i+=u) { WR(0); WR(1); WR(2); WR(3); WR(4); WR(5); WR(6); WR(7); } 177 | for (; ii+size ? size : n-i); 204 | PROF_START(5); 205 | T *aux = malloc(n*sizeof(T)); 206 | mergefrom(x, n, size, aux); 207 | free(aux); 208 | PROF_END(5); 209 | } 210 | -------------------------------------------------------------------------------- /images/wolf.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | The Big Bad Wolfsort benchmark 5 | Sort 100,000 4-byte integers, average of 100 6 | Nanoseconds per element 7 | i5-6200U CPU @ 2.30GHz; compiled with g++ -O3 8 | 9 | 10 | 11 | 12 | 13 | 14 | 0 15 | 10 16 | 20 17 | 30 18 | 19 | 20 | Random 21 | Rand. % 100 22 | Ascending 23 | Asc. saw 24 | Pipe organ 25 | Descending 26 | Desc. saw 27 | Random tail 28 | Random half 29 | Asc. 
tiles 30 | Bit reversal 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | 117 | 118 | 119 | 120 | 121 | 122 | 123 | quadsort 124 | pdqsort 125 | fluxsort 126 | wolfsort 127 | ska_sort_copy 128 | Robin Hood 129 | 130 | 131 | 132 | -------------------------------------------------------------------------------- /res/wolf.txt: -------------------------------------------------------------------------------- 1 | Info: int = 32, long long = 64, long double = 128 2 | 3 | Benchmark: array size: 100000, samples: 20, repetitions: 1, seed: 1675911835 4 | 5 | | Name | Items | Type | Best | Average | Compares | Samples | Distribution | 6 | | --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- | 7 | | quadsort | 100000 | 32 | 0.003272 | 0.003319 | 0 | 20 | random order | 8 | | pdqsort | 100000 | 32 | 0.003448 | 0.003480 | 0 | 20 | random order | 9 | | fluxsort | 100000 | 32 | 0.002542 | 0.002609 | 0 | 20 | random order | 10 | | wolfsort | 100000 | 32 | 0.002180 | 0.002322 | 0 | 20 | random order | 11 | | ska_sort | 100000 | 32 | 0.000906 | 0.000931 | 0 | 20 | random order | 12 | | rhsort | 100000 | 32 | 0.000889 | 0.000923 | 0 | 20 | random order | 13 | | | | | | | | | | 14 | | quadsort | 100000 | 32 | 0.002357 | 0.002384 | 0 | 20 | random % 100 | 15 | | pdqsort | 100000 | 32 | 0.000953 | 0.000968 | 0 | 20 | random % 100 | 16 | | fluxsort | 100000 | 32 | 0.000939 | 0.000946 | 0 | 20 | random % 100 | 17 | | wolfsort | 100000 | 32 | 0.000468 | 0.000475 | 0 | 20 | random % 100 | 18 | | ska_sort | 100000 | 32 | 0.001006 | 0.001033 | 0 | 20 | random % 100 
| 19 | | rhsort | 100000 | 32 | 0.000152 | 0.000152 | 0 | 20 | random % 100 | 20 | | | | | | | | | | 21 | | quadsort | 100000 | 32 | 0.000066 | 0.000066 | 0 | 20 | ascending order | 22 | | pdqsort | 100000 | 32 | 0.000114 | 0.000118 | 0 | 20 | ascending order | 23 | | fluxsort | 100000 | 32 | 0.000045 | 0.000045 | 0 | 20 | ascending order | 24 | | wolfsort | 100000 | 32 | 0.000115 | 0.000116 | 0 | 20 | ascending order | 25 | | ska_sort | 100000 | 32 | 0.001098 | 0.001127 | 0 | 20 | ascending order | 26 | | rhsort | 100000 | 32 | 0.000424 | 0.000451 | 0 | 20 | ascending order | 27 | | | | | | | | | | 28 | | quadsort | 100000 | 32 | 0.000803 | 0.000811 | 0 | 20 | ascending saw | 29 | | pdqsort | 100000 | 32 | 0.004132 | 0.004171 | 0 | 20 | ascending saw | 30 | | fluxsort | 100000 | 32 | 0.000438 | 0.000445 | 0 | 20 | ascending saw | 31 | | wolfsort | 100000 | 32 | 0.000509 | 0.000513 | 0 | 20 | ascending saw | 32 | | ska_sort | 100000 | 32 | 0.000937 | 0.000958 | 0 | 20 | ascending saw | 33 | | rhsort | 100000 | 32 | 0.000794 | 0.000820 | 0 | 20 | ascending saw | 34 | | | | | | | | | | 35 | | quadsort | 100000 | 32 | 0.000500 | 0.000506 | 0 | 20 | pipe organ | 36 | | pdqsort | 100000 | 32 | 0.003518 | 0.003558 | 0 | 20 | pipe organ | 37 | | fluxsort | 100000 | 32 | 0.000265 | 0.000267 | 0 | 20 | pipe organ | 38 | | wolfsort | 100000 | 32 | 0.000337 | 0.000339 | 0 | 20 | pipe organ | 39 | | ska_sort | 100000 | 32 | 0.000930 | 0.000973 | 0 | 20 | pipe organ | 40 | | rhsort | 100000 | 32 | 0.000758 | 0.000789 | 0 | 20 | pipe organ | 41 | | | | | | | | | | 42 | | quadsort | 100000 | 32 | 0.000079 | 0.000084 | 0 | 20 | descending order | 43 | | pdqsort | 100000 | 32 | 0.000233 | 0.000240 | 0 | 20 | descending order | 44 | | fluxsort | 100000 | 32 | 0.000059 | 0.000061 | 0 | 20 | descending order | 45 | | wolfsort | 100000 | 32 | 0.000130 | 0.000132 | 0 | 20 | descending order | 46 | | ska_sort | 100000 | 32 | 0.001013 | 0.001040 | 0 | 20 | descending order | 47 | | rhsort 
| 100000 | 32 | 0.000591 | 0.000742 | 0 | 20 | descending order | 48 | | | | | | | | | | 49 | | quadsort | 100000 | 32 | 0.000852 | 0.000872 | 0 | 20 | descending saw | 50 | | pdqsort | 100000 | 32 | 0.005182 | 0.005273 | 0 | 20 | descending saw | 51 | | fluxsort | 100000 | 32 | 0.000571 | 0.000586 | 0 | 20 | descending saw | 52 | | wolfsort | 100000 | 32 | 0.000643 | 0.000659 | 0 | 20 | descending saw | 53 | | ska_sort | 100000 | 32 | 0.000929 | 0.000976 | 0 | 20 | descending saw | 54 | | rhsort | 100000 | 32 | 0.000778 | 0.000809 | 0 | 20 | descending saw | 55 | | | | | | | | | | 56 | | quadsort | 100000 | 32 | 0.001040 | 0.001054 | 0 | 20 | random tail | 57 | | pdqsort | 100000 | 32 | 0.003266 | 0.003308 | 0 | 20 | random tail | 58 | | fluxsort | 100000 | 32 | 0.000827 | 0.000838 | 0 | 20 | random tail | 59 | | wolfsort | 100000 | 32 | 0.000747 | 0.000758 | 0 | 20 | random tail | 60 | | ska_sort | 100000 | 32 | 0.000923 | 0.000956 | 0 | 20 | random tail | 61 | | rhsort | 100000 | 32 | 0.000816 | 0.000838 | 0 | 20 | random tail | 62 | | | | | | | | | | 63 | | quadsort | 100000 | 32 | 0.001896 | 0.001920 | 0 | 20 | random half | 64 | | pdqsort | 100000 | 32 | 0.003396 | 0.003438 | 0 | 20 | random half | 65 | | fluxsort | 100000 | 32 | 0.001463 | 0.001490 | 0 | 20 | random half | 66 | | wolfsort | 100000 | 32 | 0.001280 | 0.001302 | 0 | 20 | random half | 67 | | ska_sort | 100000 | 32 | 0.000912 | 0.000933 | 0 | 20 | random half | 68 | | rhsort | 100000 | 32 | 0.000839 | 0.000866 | 0 | 20 | random half | 69 | | | | | | | | | | 70 | | quadsort | 100000 | 32 | 0.001052 | 0.001088 | 0 | 20 | ascending tiles | 71 | | pdqsort | 100000 | 32 | 0.002936 | 0.002991 | 0 | 20 | ascending tiles | 72 | | fluxsort | 100000 | 32 | 0.000433 | 0.000436 | 0 | 20 | ascending tiles | 73 | | wolfsort | 100000 | 32 | 0.000955 | 0.000967 | 0 | 20 | ascending tiles | 74 | | ska_sort | 100000 | 32 | 0.001771 | 0.001798 | 0 | 20 | ascending tiles | 75 | | rhsort | 100000 | 32 | 0.002922 | 
0.002981 | 0 | 20 | ascending tiles | 76 | | | | | | | | | | 77 | | quadsort | 100000 | 32 | 0.002953 | 0.003017 | 0 | 20 | bit reversal | 78 | | pdqsort | 100000 | 32 | 0.003415 | 0.003459 | 0 | 20 | bit reversal | 79 | | fluxsort | 100000 | 32 | 0.002373 | 0.002427 | 0 | 20 | bit reversal | 80 | | wolfsort | 100000 | 32 | 0.002043 | 0.002129 | 0 | 20 | bit reversal | 81 | | ska_sort | 100000 | 32 | 0.000972 | 0.001004 | 0 | 20 | bit reversal | 82 | | rhsort | 100000 | 32 | 0.001010 | 0.001047 | 0 | 20 | bit reversal | 83 | -------------------------------------------------------------------------------- /images/parts.svg: -------------------------------------------------------------------------------- Plot: Sorting random 4-byte integers. X axis: Size (ticks 100, 1000, 1e4, 1e5, 1e6, 1e7). Y axis: Time (ns / value) (ticks 5 to 50 in steps of 5). Legend: fluxsort, Robin Hood. -------------------------------------------------------------------------------- /images/bad.svg: -------------------------------------------------------------------------------- Plot: Sorting small-range plus outlier. X axis: Size (ticks 100, 1000, 1e4, 1e5, 1e6, 1e7). Y axis: Time (ns / value) (ticks 10 to 110 in steps of 10). Legend: Robin Hood, mergesort, quadsort, fluxsort. -------------------------------------------------------------------------------- /res/crit.txt: 
-------------------------------------------------------------------------------- 1 | Checking 100 candidates out of 10000 values 2 | Testing seed 101: 2 22 31 38 44 51 59 66 74 88 185 best: 12.687 avg: 13.375 ns/v 3 | Testing seed 112: 244 368 404 429 450 471 497 524 554 599 817 best: 25.901 avg: 26.993 ns/v 4 | Testing seed 123: 897 1317 1444 1523 1616 1685 1767 1846 1942 2106 2734 best: 27.117 avg: 28.282 ns/v 5 | Testing seed 134: 66 127 152 166 181 197 210 227 248 279 429 best: 19.131 avg: 19.851 ns/v 6 | Testing seed 145: 144 224 256 272 304 320 336 352 384 416 576 best: 31.418 avg: 32.149 ns/v 7 | Testing seed 156: 384 496 528 560 592 608 640 672 704 752 1104 best: 31.708 avg: 32.473 ns/v 8 | Testing seed 167: 7 37 47 55 61 69 76 85 95 109 236 best: 10.798 avg: 11.239 ns/v 9 | Testing seed 178: 0 16 16 32 32 48 48 64 64 80 176 best: 18.689 avg: 19.275 ns/v 10 | Testing seed 189: 63 140 157 171 184 198 213 228 247 274 408 best: 20.689 avg: 21.374 ns/v 11 | Testing seed 200: 128 224 256 272 304 320 336 352 384 416 656 best: 31.448 avg: 32.098 ns/v 12 | Testing seed 211: 2 45 55 63 71 79 87 97 108 122 203 best: 12.517 avg: 12.930 ns/v 13 | Testing seed 222: 58 119 137 148 160 171 184 197 215 242 344 best: 21.025 avg: 21.556 ns/v 14 | Testing seed 233: 20 63 75 86 97 106 115 126 142 162 250 best: 23.703 avg: 24.429 ns/v 15 | Testing seed 244: 0 16 24 31 36 41 47 54 63 74 145 best: 7.401 avg: 7.689 ns/v 16 | Testing seed 255: 468 622 665 692 719 748 775 806 839 893 1172 best: 31.481 avg: 32.361 ns/v 17 | Testing seed 266: 8 35 46 54 60 69 76 84 96 111 194 best: 10.498 avg: 11.003 ns/v 18 | Testing seed 277: 105 241 282 315 349 377 414 446 502 565 1057 best: 17.084 avg: 17.955 ns/v 19 | Testing seed 288: 48 96 112 128 144 160 160 176 192 224 368 best: 30.759 avg: 31.664 ns/v 20 | Testing seed 299: 1 39 49 58 65 72 79 88 100 116 204 best: 11.929 avg: 12.470 ns/v 21 | Testing seed 310: 5 29 39 47 55 62 68 76 86 99 175 best: 7.220 avg: 7.486 ns/v 22 | Testing seed 
321: 12 50 62 73 81 89 97 106 117 135 204 best: 13.156 avg: 13.737 ns/v 23 | Testing seed 332: 32 96 128 128 144 160 176 192 208 240 352 best: 28.904 avg: 29.709 ns/v 24 | Testing seed 343: 144 240 272 288 304 320 336 352 368 416 544 best: 29.942 avg: 30.524 ns/v 25 | Testing seed 354: 53 124 141 158 175 189 204 223 247 281 464 best: 18.651 avg: 19.287 ns/v 26 | Testing seed 365: 182 323 364 393 420 445 469 494 529 592 897 best: 27.035 avg: 27.833 ns/v 27 | Testing seed 376: 25 56 66 76 84 92 99 108 120 139 231 best: 19.226 avg: 19.682 ns/v 28 | Testing seed 387: 133 323 376 417 444 478 515 558 614 699 1500 best: 23.470 avg: 24.406 ns/v 29 | Testing seed 398: 0 19 27 34 39 45 51 58 65 79 175 best: 8.053 avg: 8.394 ns/v 30 | Testing seed 409: 102 207 230 249 266 281 301 320 343 378 633 best: 21.612 avg: 22.483 ns/v 31 | Testing seed 420: 134 258 289 311 334 354 376 398 426 470 732 best: 22.964 avg: 23.699 ns/v 32 | Testing seed 431: 5 40 50 61 69 77 86 95 104 120 200 best: 12.000 avg: 12.388 ns/v 33 | Testing seed 442: 170 316 352 381 408 432 457 485 525 577 857 best: 20.308 avg: 21.224 ns/v 34 | Testing seed 453: 57 123 140 153 167 179 191 204 223 242 477 best: 21.455 avg: 22.089 ns/v 35 | Testing seed 464: 66 134 155 174 189 204 220 237 256 293 585 best: 19.623 avg: 20.161 ns/v 36 | Testing seed 475: 14 45 57 68 75 83 91 102 114 131 233 best: 11.532 avg: 11.954 ns/v 37 | Testing seed 486: 50 148 188 211 236 260 285 314 348 408 679 best: 15.099 avg: 15.894 ns/v 38 | Testing seed 497: 298 534 590 632 679 718 759 804 861 946 1626 best: 31.233 avg: 32.230 ns/v 39 | Testing seed 508: 105 228 259 280 298 314 335 355 384 422 604 best: 25.051 avg: 25.849 ns/v 40 | Testing seed 519: 16 83 98 109 121 133 144 155 169 190 322 best: 16.741 avg: 17.223 ns/v 41 | Testing seed 530: 24 72 86 98 109 119 130 143 160 185 283 best: 13.165 avg: 13.844 ns/v 42 | Testing seed 541: 0 25 33 41 47 53 60 68 77 91 149 best: 7.391 avg: 7.754 ns/v 43 | Testing seed 552: 144 240 256 288 304 320 
336 352 384 416 560 best: 31.565 avg: 32.457 ns/v 44 | Testing seed 563: 35 75 92 104 114 124 135 145 160 182 263 best: 22.291 avg: 22.770 ns/v 45 | Testing seed 574: 57 158 190 224 256 286 317 360 403 493 941 best: 13.316 avg: 13.825 ns/v 46 | Testing seed 585: 188 289 318 339 356 375 394 411 439 476 650 best: 25.130 avg: 25.867 ns/v 47 | Testing seed 596: 10 56 67 78 85 93 102 111 122 139 216 best: 16.869 avg: 17.359 ns/v 48 | Testing seed 607: 144 358 429 488 538 580 629 684 757 861 1380 best: 18.049 avg: 18.950 ns/v 49 | Testing seed 618: 4198 5223 5456 5621 5795 5984 6152 6323 6517 6804 8144 best: 33.929 avg: 35.275 ns/v 50 | Testing seed 629: 41 99 118 130 142 154 166 179 197 225 379 best: 17.733 avg: 18.387 ns/v 51 | Testing seed 640: 80 158 178 193 214 227 243 260 281 312 497 best: 26.430 avg: 27.237 ns/v 52 | Testing seed 651: 27 96 114 131 146 160 181 203 236 281 542 best: 13.779 avg: 14.210 ns/v 53 | Testing seed 662: 15 62 75 85 93 101 109 121 132 151 226 best: 15.505 avg: 15.843 ns/v 54 | Testing seed 673: 28 93 110 125 137 149 161 175 194 223 412 best: 21.417 avg: 21.938 ns/v 55 | Testing seed 684: 1047 1507 1627 1719 1785 1843 1920 1989 2078 2218 2849 best: 27.858 avg: 28.901 ns/v 56 | Testing seed 695: 43 98 116 131 143 153 164 179 196 221 339 best: 25.460 avg: 26.353 ns/v 57 | Testing seed 706: 486 784 849 901 946 996 1050 1096 1160 1260 2226 best: 27.554 avg: 28.576 ns/v 58 | Testing seed 717: 95 214 242 268 290 317 341 365 394 451 635 best: 23.458 avg: 24.240 ns/v 59 | Testing seed 728: 390 503 534 563 586 605 626 652 679 726 942 best: 30.480 avg: 31.821 ns/v 60 | Testing seed 739: 76 150 173 191 207 222 237 252 274 309 514 best: 20.229 avg: 21.003 ns/v 61 | Testing seed 750: 8 36 46 55 62 69 80 90 100 119 238 best: 11.610 avg: 12.054 ns/v 62 | Testing seed 761: 3 37 48 56 63 69 75 83 93 109 182 best: 8.374 avg: 8.769 ns/v 63 | Testing seed 772: 115 240 268 293 314 340 362 390 418 467 683 best: 26.760 avg: 27.402 ns/v 64 | Testing seed 783: 0 38 
50 59 67 75 84 93 104 122 223 best: 11.546 avg: 12.008 ns/v 65 | Testing seed 794: 10 43 55 64 70 78 85 92 104 120 210 best: 9.879 avg: 10.351 ns/v 66 | Testing seed 805: 106 186 207 223 237 252 270 284 305 335 531 best: 24.280 avg: 24.901 ns/v 67 | Testing seed 816: 222 448 520 577 633 682 735 792 878 1024 1759 best: 20.679 avg: 22.225 ns/v 68 | Testing seed 827: 579 958 1067 1136 1200 1269 1331 1396 1494 1623 2513 best: 30.363 avg: 31.106 ns/v 69 | Testing seed 838: 0 37 49 57 65 72 79 87 97 110 205 best: 9.932 avg: 10.244 ns/v 70 | Testing seed 849: 59 169 208 233 260 289 315 351 397 455 820 best: 15.009 avg: 15.593 ns/v 71 | Testing seed 860: 207 473 550 617 678 728 798 873 983 1109 2410 best: 24.271 avg: 24.997 ns/v 72 | Testing seed 871: 0 15 21 26 31 37 43 48 56 67 129 best: 5.915 avg: 6.085 ns/v 73 | Testing seed 882: 70 148 175 196 214 235 255 277 306 347 645 best: 18.859 avg: 19.657 ns/v 74 | Testing seed 893: 165 305 346 381 421 458 492 533 591 667 1193 best: 19.968 avg: 20.593 ns/v 75 | Testing seed 904: 242 486 567 630 697 756 818 885 986 1106 2246 best: 24.337 avg: 25.199 ns/v 76 | Testing seed 915: 32 96 112 128 144 160 176 192 208 224 416 best: 29.902 avg: 30.509 ns/v 77 | Testing seed 926: 66 137 163 184 202 221 241 267 297 345 633 best: 17.212 avg: 17.793 ns/v 78 | Testing seed 937: 26 81 103 122 141 160 183 213 255 326 936 best: 18.610 avg: 19.093 ns/v 79 | Testing seed 948: 22 57 73 84 97 108 121 133 152 180 380 best: 13.326 avg: 13.933 ns/v 80 | Testing seed 959: 36 152 180 200 221 242 263 286 310 353 583 best: 23.165 avg: 23.902 ns/v 81 | Testing seed 970: 175 468 570 670 754 837 925 1029 1159 1339 2416 best: 19.790 avg: 20.677 ns/v 82 | Testing seed 981: 12 61 75 90 103 113 127 142 162 194 530 best: 10.035 avg: 10.606 ns/v 83 | Testing seed 992: 0 19 28 34 40 45 51 58 68 82 154 best: 11.273 avg: 11.685 ns/v 84 | Testing seed 1003: 0 18 27 33 38 44 50 56 62 77 159 best: 6.600 avg: 6.899 ns/v 85 | Testing seed 1014: 50 130 151 167 179 194 205 
220 242 269 444 best: 29.005 avg: 29.915 ns/v 86 | Testing seed 1025: 144 260 297 324 351 376 403 436 476 528 848 best: 21.699 avg: 22.526 ns/v 87 | Testing seed 1036: 35 154 184 209 237 260 289 323 366 433 894 best: 19.413 avg: 20.113 ns/v 88 | Testing seed 1047: 2 30 40 48 55 62 68 77 87 103 225 best: 17.266 avg: 17.895 ns/v 89 | Testing seed 1058: 163 274 315 339 364 389 417 449 484 541 809 best: 20.392 avg: 21.275 ns/v 90 | Testing seed 1069: 101 189 215 239 261 280 310 336 379 435 987 best: 20.886 avg: 21.657 ns/v 91 | Testing seed 1080: 80 159 189 209 227 247 268 289 316 355 621 best: 16.858 avg: 17.597 ns/v 92 | Testing seed 1091: 0 37 48 56 63 71 78 86 97 111 208 best: 19.101 avg: 19.691 ns/v 93 | Testing seed 1102: 69 132 159 174 188 204 221 244 272 306 514 best: 18.737 avg: 19.494 ns/v 94 | Testing seed 1113: 6 57 71 82 92 101 112 121 135 153 281 best: 14.267 avg: 14.939 ns/v 95 | Testing seed 1124: 404 664 731 778 819 863 899 943 1017 1100 1569 best: 27.161 avg: 28.238 ns/v 96 | Testing seed 1135: 98 234 268 296 323 350 382 414 460 526 815 best: 19.759 avg: 20.602 ns/v 97 | Testing seed 1146: 32 195 224 246 265 282 303 323 347 385 609 best: 20.289 avg: 21.130 ns/v 98 | Testing seed 1157: 4 41 53 62 69 77 84 93 104 119 216 best: 11.700 avg: 12.299 ns/v 99 | Testing seed 1168: 0 32 48 64 64 80 96 96 112 128 240 best: 21.763 avg: 22.478 ns/v 100 | Testing seed 1179: 96 224 256 272 288 320 336 352 384 416 608 best: 30.552 avg: 31.420 ns/v 101 | Testing seed 1190: 4 45 56 64 72 80 88 97 109 125 186 best: 11.634 avg: 12.036 ns/v 102 | Testing seed 1201: 3 27 38 44 50 57 64 73 83 97 176 best: 13.986 avg: 14.716 ns/v 103 | Testing seed 1212: 31 83 102 116 126 139 154 172 188 218 365 best: 13.440 avg: 14.014 ns/v 104 | Testing seed 1223: 159 464 558 636 711 777 850 953 1067 1263 2480 best: 30.571 avg: 31.606 ns/v 105 | Testing seed 1234: 0 25 33 40 46 53 60 67 76 90 136 best: 6.030 avg: 6.256 ns/v 106 | Testing seed 1245: 52 232 276 315 352 397 442 503 588 726 
1466 best: 19.529 avg: 20.368 ns/v 107 | Testing seed 1256: 25 64 79 89 99 110 119 132 146 170 285 best: 20.841 avg: 21.530 ns/v 108 | Testing seed 1267: 0 25 34 41 47 52 59 66 77 91 169 best: 7.199 avg: 7.517 ns/v 109 | Testing seed 1278: 419 609 653 679 713 740 769 806 839 902 1497 best: 31.012 avg: 31.998 ns/v 110 | Testing seed 1289: 11 45 57 67 75 82 90 100 112 129 219 best: 12.455 avg: 13.078 ns/v 111 | Testing seed 1300: 0 28 37 45 51 57 64 71 82 98 178 best: 8.158 avg: 8.668 ns/v 112 | Testing seed 1311: 18 65 80 92 103 115 123 133 146 169 303 best: 25.048 avg: 25.860 ns/v 113 | Testing seed 1322: 71 140 158 174 188 201 213 227 247 275 416 best: 21.646 avg: 22.389 ns/v 114 | Testing seed 1333: 396 533 567 594 619 642 666 695 727 769 1024 best: 31.002 avg: 31.853 ns/v 115 | Testing seed 1344: 68 146 169 191 208 228 247 273 310 359 684 best: 18.614 avg: 19.200 ns/v 116 | Testing seed 1355: 16 96 112 144 144 160 176 192 208 224 384 best: 29.477 avg: 30.176 ns/v 117 | Testing seed 1366: 162 272 302 326 348 366 389 412 440 483 698 best: 29.898 avg: 30.625 ns/v 118 | Testing seed 1377: 18 74 91 102 115 127 140 153 170 197 362 best: 13.596 avg: 14.165 ns/v 119 | Testing seed 1388: 8 71 87 101 114 126 138 151 169 197 363 best: 20.176 avg: 20.845 ns/v 120 | Testing seed 1399: 14 62 76 89 100 110 121 135 150 176 367 best: 16.659 avg: 17.190 ns/v 121 | Testing seed 1410: 43 134 160 182 204 223 246 277 319 365 715 best: 18.208 avg: 18.986 ns/v 122 | Testing seed 1421: 4 30 40 48 55 61 68 77 88 99 165 best: 10.210 avg: 10.608 ns/v 123 | Testing seed 1432: 8 75 90 101 112 123 133 148 163 182 281 best: 15.990 avg: 16.771 ns/v 124 | Testing seed 1443: 23 73 90 101 115 126 137 152 166 194 487 best: 14.163 avg: 14.794 ns/v 125 | Testing seed 1454: 5 37 48 56 64 72 79 87 97 112 183 best: 11.157 avg: 11.644 ns/v 126 | Testing seed 1465: 122 255 300 329 359 389 420 455 501 587 1098 best: 20.780 avg: 21.449 ns/v 127 | Testing seed 1476: 66 121 141 159 173 187 200 218 237 274 461 
best: 19.697 avg: 20.390 ns/v 128 | Testing seed 1487: 128 224 256 272 288 320 336 352 384 416 560 best: 30.297 avg: 31.121 ns/v 129 | Testing seed 1498: 10 51 64 74 83 90 98 109 122 141 253 best: 13.245 avg: 13.745 ns/v 130 | Testing seed 1509: 34 89 105 117 129 141 153 167 185 212 357 best: 16.613 avg: 17.388 ns/v 131 | Testing seed 1520: 32 96 112 128 144 160 176 192 208 240 400 best: 32.224 avg: 33.245 ns/v 132 | Testing seed 1531: 18 90 113 135 153 171 195 221 252 305 584 best: 14.738 avg: 15.424 ns/v 133 | Testing seed 1542: 84 162 192 212 232 250 272 296 322 362 681 best: 17.145 avg: 17.917 ns/v 134 | Testing seed 1553: 8 53 66 74 84 92 101 112 127 144 350 best: 12.610 avg: 13.121 ns/v 135 | Testing seed 1564: 58 125 149 169 186 204 220 243 267 316 526 best: 15.012 avg: 15.863 ns/v 136 | Testing seed 1575: 52 104 120 136 149 160 176 192 208 229 360 best: 28.899 avg: 29.723 ns/v 137 | Testing seed 1586: 0 16 23 30 35 40 46 53 60 72 130 best: 6.605 avg: 6.871 ns/v 138 | Testing seed 1597: 80 150 175 196 214 233 248 267 298 340 621 best: 18.922 avg: 19.766 ns/v 139 | Testing seed 1608: 75 180 208 232 251 269 290 312 339 387 672 best: 17.959 avg: 18.590 ns/v 140 | Testing seed 1619: 22 98 115 129 143 156 172 188 211 236 420 best: 17.246 avg: 17.832 ns/v 141 | Testing seed 1630: 791 1011 1086 1141 1196 1263 1323 1374 1448 1557 2451 best: 32.417 avg: 33.416 ns/v 142 | Testing seed 1641: 10 43 54 62 69 76 84 92 105 121 219 best: 10.808 avg: 11.303 ns/v 143 | Testing seed 1652: 118 341 417 479 525 583 640 709 802 943 2523 best: 20.814 avg: 21.722 ns/v 144 | Testing seed 1663: 65 154 185 210 227 244 262 281 303 354 622 best: 23.885 avg: 24.734 ns/v 145 | Testing seed 1674: 251 445 499 546 586 626 664 710 775 856 1179 best: 18.901 avg: 19.701 ns/v 146 | Testing seed 1685: 16 71 85 99 112 125 139 152 168 194 567 best: 12.717 avg: 13.278 ns/v 147 | Testing seed 1696: 68 199 241 277 312 346 386 444 496 599 1133 best: 20.872 avg: 21.557 ns/v 148 | Testing seed 1707: 0 34 
44 52 58 64 72 80 91 107 176 best: 8.539 avg: 8.966 ns/v 149 | Testing seed 1718: 59 130 148 166 181 194 210 223 242 278 481 best: 28.508 avg: 29.367 ns/v 150 | Testing seed 1729: 99 229 278 314 348 381 414 467 521 646 1290 best: 20.460 avg: 21.083 ns/v 151 | Testing seed 1740: 13 71 85 94 104 114 123 132 145 163 253 best: 18.178 avg: 18.687 ns/v 152 | -------------------------------------------------------------------------------- /res/crit5.txt: -------------------------------------------------------------------------------- 1 | Checking 316 candidates out of 100000 values 2 | Testing seed 101: 0 56 70 83 96 102 116 130 146 164 272 best: 30.876 avg: 32.242 ns/v 3 | Testing seed 112: 323 500 534 560 584 609 634 661 699 739 983 best: 29.565 avg: 30.007 ns/v 4 | Testing seed 123: 1476 1902 2007 2071 2132 2192 2254 2330 2413 2537 3099 best: 30.313 avg: 30.721 ns/v 5 | Testing seed 134: 23 81 95 106 118 127 137 150 165 183 277 best: 23.475 avg: 25.138 ns/v 6 | Testing seed 145: 2448 2848 2928 2992 3040 3104 3152 3232 3296 3408 3920 best: 33.104 avg: 33.369 ns/v 7 | Testing seed 156: 5296 5824 5952 6048 6160 6224 6304 6400 6496 6640 7632 best: 32.270 avg: 32.822 ns/v 8 | Testing seed 167: 3 50 62 71 79 86 94 102 113 129 208 best: 16.226 avg: 16.528 ns/v 9 | Testing seed 178: 160 288 336 352 368 384 416 432 464 496 656 best: 39.525 avg: 40.183 ns/v 10 | Testing seed 189: 106 188 207 226 238 253 270 286 303 334 483 best: 29.150 avg: 29.568 ns/v 11 | Testing seed 200: 2496 2832 2944 3008 3056 3104 3168 3232 3296 3408 3824 best: 34.789 avg: 35.272 ns/v 12 | Testing seed 211: 23 57 70 80 91 100 109 118 128 145 216 best: 17.683 avg: 18.133 ns/v 13 | Testing seed 222: 23 72 88 98 106 116 125 136 147 166 257 best: 20.984 avg: 21.895 ns/v 14 | Testing seed 233: 480 656 688 736 752 784 816 848 880 928 1152 best: 36.953 avg: 37.516 ns/v 15 | Testing seed 244: 0 18 33 36 49 52 65 68 82 97 179 best: 21.550 avg: 21.860 ns/v 16 | Testing seed 255: 5296 5856 5968 6080 6160 6224 6304 
6368 6464 6640 7296 best: 32.158 avg: 32.983 ns/v 17 | Testing seed 266: 14 46 60 67 76 84 92 101 112 129 188 best: 15.302 avg: 15.720 ns/v 18 | Testing seed 277: 211 370 417 446 477 504 531 562 602 650 948 best: 24.553 avg: 25.415 ns/v 19 | Testing seed 288: 1104 1360 1424 1472 1504 1552 1600 1648 1696 1760 2128 best: 34.931 avg: 35.504 ns/v 20 | Testing seed 299: 1 53 66 75 83 91 100 110 121 137 250 best: 16.131 avg: 16.665 ns/v 21 | Testing seed 310: 0 39 51 59 66 74 81 91 100 114 204 best: 11.157 avg: 11.519 ns/v 22 | Testing seed 321: 32 67 80 92 101 110 120 129 141 159 276 best: 20.799 avg: 21.423 ns/v 23 | Testing seed 332: 1168 1376 1440 1472 1520 1568 1600 1648 1696 1760 2080 best: 34.111 avg: 34.862 ns/v 24 | Testing seed 343: 2496 2832 2928 3008 3056 3104 3168 3232 3296 3424 3872 best: 32.875 avg: 33.473 ns/v 25 | Testing seed 354: 36 80 93 105 116 127 136 146 163 184 297 best: 18.772 avg: 19.553 ns/v 26 | Testing seed 365: 619 830 882 914 943 972 1001 1037 1074 1130 1404 best: 34.667 avg: 35.224 ns/v 27 | Testing seed 376: 48 128 160 176 192 208 208 224 256 272 416 best: 35.159 avg: 35.727 ns/v 28 | Testing seed 387: 522 897 1040 1146 1252 1374 1486 1637 1805 2091 3557 best: 27.685 avg: 28.035 ns/v 29 | Testing seed 398: 3 33 40 50 55 60 71 79 90 104 180 best: 21.849 avg: 22.338 ns/v 30 | Testing seed 409: 190 283 310 334 352 368 387 405 431 465 697 best: 30.489 avg: 30.940 ns/v 31 | Testing seed 420: 231 347 378 402 421 440 459 483 510 544 750 best: 27.216 avg: 27.728 ns/v 32 | Testing seed 431: 19 57 68 78 87 95 103 112 124 141 229 best: 19.224 avg: 19.870 ns/v 33 | Testing seed 442: 300 442 478 505 528 551 574 605 634 682 999 best: 24.688 avg: 25.351 ns/v 34 | Testing seed 453: 18 80 92 104 113 123 134 145 157 176 286 best: 28.335 avg: 29.181 ns/v 35 | Testing seed 464: 23 86 100 111 121 131 141 153 167 186 269 best: 19.183 avg: 19.776 ns/v 36 | Testing seed 475: 20 65 79 86 94 102 112 124 138 155 272 best: 15.343 avg: 15.690 ns/v 37 | Testing seed 
486: 101 253 288 313 339 359 384 409 446 486 711 best: 24.115 avg: 24.662 ns/v 38 | Testing seed 497: 1136 1360 1424 1472 1520 1552 1600 1648 1696 1776 2192 best: 34.367 avg: 35.193 ns/v 39 | Testing seed 508: 124 212 236 254 269 285 298 314 341 373 545 best: 36.822 avg: 37.887 ns/v 40 | Testing seed 519: 15 50 62 71 79 86 94 104 113 127 204 best: 15.829 avg: 16.432 ns/v 41 | Testing seed 530: 35 94 113 124 137 148 159 173 186 207 323 best: 19.329 avg: 19.775 ns/v 42 | Testing seed 541: 2 36 46 53 60 67 74 84 95 110 170 best: 18.647 avg: 19.322 ns/v 43 | Testing seed 552: 2608 2864 2928 2992 3056 3104 3168 3232 3312 3424 4016 best: 33.717 avg: 34.147 ns/v 44 | Testing seed 563: 144 288 336 352 368 384 416 432 464 512 688 best: 39.192 avg: 39.942 ns/v 45 | Testing seed 574: 265 617 699 775 845 904 964 1033 1125 1280 2274 best: 17.014 avg: 17.277 ns/v 46 | Testing seed 585: 289 387 415 437 459 479 497 520 544 583 750 best: 29.613 avg: 29.990 ns/v 47 | Testing seed 596: 51 94 109 121 132 143 155 167 182 206 319 best: 30.970 avg: 31.876 ns/v 48 | Testing seed 607: 381 572 626 677 715 749 790 831 884 961 1232 best: 20.738 avg: 21.320 ns/v 49 | Testing seed 618: 6399 7544 7777 7946 8082 8195 8322 8439 8618 8853 9754 best: 35.672 avg: 36.246 ns/v 50 | Testing seed 629: 60 134 155 170 183 197 208 221 239 268 387 best: 23.521 avg: 24.150 ns/v 51 | Testing seed 640: 448 640 688 720 752 784 816 848 880 928 1200 best: 34.225 avg: 34.610 ns/v 52 | Testing seed 651: 59 146 171 190 208 223 243 268 290 333 566 best: 18.315 avg: 18.883 ns/v 53 | Testing seed 662: 32 85 99 110 121 130 139 150 164 181 260 best: 21.723 avg: 22.234 ns/v 54 | Testing seed 673: 126 195 224 247 262 277 294 313 334 374 505 best: 35.235 avg: 35.793 ns/v 55 | Testing seed 684: 1729 2104 2210 2275 2327 2386 2437 2494 2571 2683 3525 best: 28.454 avg: 28.950 ns/v 56 | Testing seed 695: 448 656 704 736 752 784 816 848 880 928 1120 best: 35.224 avg: 35.789 ns/v 57 | Testing seed 706: 865 1132 1204 1248 1294 1331 
1379 1417 1469 1565 1972 best: 28.569 avg: 29.017 ns/v 58 | Testing seed 717: 255 361 399 419 445 467 493 518 549 589 776 best: 34.452 avg: 35.044 ns/v 59 | Testing seed 728: 201 338 369 396 420 441 463 483 512 545 704 best: 35.545 avg: 36.247 ns/v 60 | Testing seed 739: 43 96 110 120 131 143 154 166 181 199 313 best: 21.515 avg: 22.210 ns/v 61 | Testing seed 750: 11 54 66 76 85 94 103 113 126 140 240 best: 22.999 avg: 23.485 ns/v 62 | Testing seed 761: 3 49 60 69 76 84 92 102 110 128 221 best: 13.972 avg: 14.422 ns/v 63 | Testing seed 772: 581 717 761 797 827 858 882 911 951 1003 1310 best: 35.117 avg: 35.732 ns/v 64 | Testing seed 783: 14 58 70 79 86 95 103 112 123 138 223 best: 16.662 avg: 17.030 ns/v 65 | Testing seed 794: 15 59 73 81 89 97 105 117 127 146 232 best: 16.640 avg: 16.924 ns/v 66 | Testing seed 805: 162 243 268 289 308 326 342 359 377 410 530 best: 31.511 avg: 32.074 ns/v 67 | Testing seed 816: 474 735 803 863 912 964 1014 1058 1124 1216 1799 best: 24.804 avg: 25.377 ns/v 68 | Testing seed 827: 1520 1819 1899 1966 2019 2068 2116 2166 2245 2344 2760 best: 35.131 avg: 35.657 ns/v 69 | Testing seed 838: 19 52 64 73 80 89 96 105 116 133 221 best: 15.527 avg: 15.932 ns/v 70 | Testing seed 849: 192 300 331 360 388 416 441 471 503 561 805 best: 26.132 avg: 26.652 ns/v 71 | Testing seed 860: 376 680 768 836 908 969 1040 1102 1196 1332 2114 best: 38.453 avg: 39.090 ns/v 72 | Testing seed 871: 0 19 28 34 39 46 52 59 66 78 136 best: 8.047 avg: 8.666 ns/v 73 | Testing seed 882: 137 224 254 275 293 313 330 350 377 419 567 best: 24.675 avg: 25.276 ns/v 74 | Testing seed 893: 124 232 259 281 301 320 338 363 390 431 790 best: 21.533 avg: 22.321 ns/v 75 | Testing seed 904: 661 912 993 1052 1099 1153 1208 1266 1359 1457 1959 best: 35.640 avg: 36.209 ns/v 76 | Testing seed 915: 1168 1360 1424 1488 1520 1568 1600 1648 1696 1792 2064 best: 36.120 avg: 36.671 ns/v 77 | Testing seed 926: 92 203 228 248 265 281 302 321 345 388 698 best: 20.636 avg: 21.350 ns/v 78 | 
Testing seed 937: 106 258 333 393 448 506 559 637 736 880 1502 best: 32.549 avg: 33.131 ns/v 79 | Testing seed 948: 25 86 101 114 124 136 149 162 177 205 298 best: 17.501 avg: 18.122 ns/v 80 | Testing seed 959: 210 341 371 398 419 440 459 481 510 552 712 best: 38.525 avg: 39.285 ns/v 81 | Testing seed 970: 538 936 1038 1127 1207 1278 1339 1432 1543 1677 2527 best: 23.046 avg: 23.730 ns/v 82 | Testing seed 981: 29 109 126 141 156 168 182 198 216 242 412 best: 13.934 avg: 14.234 ns/v 83 | Testing seed 992: 16 48 64 80 96 96 112 128 144 160 240 best: 29.828 avg: 30.243 ns/v 84 | Testing seed 1003: 1 25 33 41 47 54 59 67 76 89 146 best: 9.463 avg: 9.909 ns/v 85 | Testing seed 1014: 1104 1376 1440 1488 1520 1568 1600 1632 1696 1776 2112 best: 36.131 avg: 36.833 ns/v 86 | Testing seed 1025: 234 376 415 444 470 493 515 541 570 616 923 best: 26.763 avg: 27.169 ns/v 87 | Testing seed 1036: 147 280 312 344 373 395 419 447 478 533 829 best: 31.787 avg: 32.452 ns/v 88 | Testing seed 1047: 32 128 160 176 192 192 208 240 240 272 432 best: 36.725 avg: 37.670 ns/v 89 | Testing seed 1058: 198 406 439 465 491 516 538 568 602 649 849 best: 24.850 avg: 25.325 ns/v 90 | Testing seed 1069: 161 296 327 351 373 395 417 448 478 530 786 best: 29.744 avg: 30.307 ns/v 91 | Testing seed 1080: 62 107 123 136 148 158 171 185 204 230 346 best: 19.561 avg: 20.423 ns/v 92 | Testing seed 1091: 176 304 336 352 368 384 400 432 448 496 656 best: 37.509 avg: 38.171 ns/v 93 | Testing seed 1102: 95 203 226 244 262 279 294 309 333 361 583 best: 22.748 avg: 23.106 ns/v 94 | Testing seed 1113: 28 82 96 109 119 129 140 153 168 192 265 best: 18.394 avg: 18.939 ns/v 95 | Testing seed 1124: 743 1007 1060 1101 1141 1182 1224 1264 1320 1391 1734 best: 34.735 avg: 35.340 ns/v 96 | Testing seed 1135: 223 359 399 423 448 474 501 528 567 618 909 best: 23.604 avg: 24.235 ns/v 97 | Testing seed 1146: 169 281 311 331 349 369 390 411 435 468 672 best: 25.088 avg: 25.606 ns/v 98 | Testing seed 1157: 15 56 69 78 88 94 104 
113 125 140 213 best: 18.052 avg: 18.713 ns/v 99 | Testing seed 1168: 480 640 688 720 752 784 816 848 880 944 1232 best: 37.670 avg: 38.380 ns/v 100 | Testing seed 1179: 2512 2832 2928 2992 3056 3104 3168 3232 3312 3408 3840 best: 33.031 avg: 33.527 ns/v 101 | Testing seed 1190: 21 64 74 85 94 103 112 122 133 151 266 best: 17.730 avg: 18.078 ns/v 102 | Testing seed 1201: 48 128 160 176 192 208 224 240 256 272 432 best: 41.205 avg: 42.010 ns/v 103 | Testing seed 1212: 9 56 70 79 85 93 102 111 123 143 220 best: 20.149 avg: 20.788 ns/v 104 | Testing seed 1223: 1560 1959 2056 2117 2184 2252 2311 2385 2476 2610 3325 best: 33.996 avg: 34.511 ns/v 105 | Testing seed 1234: 0 14 19 24 29 35 40 47 54 65 125 best: 8.271 avg: 8.850 ns/v 106 | Testing seed 1245: 97 215 256 282 310 336 362 393 427 496 872 best: 19.754 avg: 20.275 ns/v 107 | Testing seed 1256: 192 304 336 352 368 400 416 432 464 496 688 best: 35.758 avg: 36.078 ns/v 108 | Testing seed 1267: 1 35 45 52 59 67 75 83 91 107 187 best: 18.951 avg: 19.305 ns/v 109 | Testing seed 1278: 5376 6064 6192 6288 6384 6480 6576 6688 6816 7024 8032 best: 31.120 avg: 31.527 ns/v 110 | Testing seed 1289: 9 65 78 87 97 105 112 122 135 153 242 best: 18.327 avg: 18.678 ns/v 111 | Testing seed 1300: 0 15 22 29 35 39 45 51 59 69 116 best: 12.533 avg: 13.202 ns/v 112 | Testing seed 1311: 432 640 688 720 752 784 816 848 880 928 1264 best: 37.392 avg: 37.827 ns/v 113 | Testing seed 1322: 27 84 99 111 122 130 140 151 163 181 294 best: 22.766 avg: 23.575 ns/v 114 | Testing seed 1333: 503 712 755 783 809 830 857 884 920 963 1163 best: 32.388 avg: 32.929 ns/v 115 | Testing seed 1344: 123 234 261 285 306 327 348 376 409 454 683 best: 23.960 avg: 24.222 ns/v 116 | Testing seed 1355: 1104 1360 1424 1472 1520 1552 1600 1632 1680 1760 2080 best: 35.057 avg: 35.320 ns/v 117 | Testing seed 1366: 2464 2832 2928 2992 3056 3120 3168 3216 3296 3408 4080 best: 32.920 avg: 33.225 ns/v 118 | Testing seed 1377: 41 107 123 134 146 156 169 182 199 222 392 
best: 17.754 avg: 18.307 ns/v 119 | Testing seed 1388: 89 198 224 242 259 275 292 312 333 369 528 best: 37.000 avg: 37.563 ns/v 120 | Testing seed 1399: 40 111 129 142 155 169 181 194 212 241 355 best: 30.904 avg: 31.167 ns/v 121 | Testing seed 1410: 126 232 259 280 301 323 345 367 400 441 772 best: 22.561 avg: 23.434 ns/v 122 | Testing seed 1421: 3 45 55 64 72 80 88 97 108 125 206 best: 20.578 avg: 21.322 ns/v 123 | Testing seed 1432: 15 48 59 69 78 86 94 104 116 132 264 best: 23.533 avg: 24.413 ns/v 124 | Testing seed 1443: 8 49 60 70 77 85 94 103 113 129 247 best: 14.170 avg: 14.825 ns/v 125 | Testing seed 1454: 16 52 64 74 80 88 96 105 115 133 231 best: 17.901 avg: 18.397 ns/v 126 | Testing seed 1465: 227 400 437 466 493 521 545 582 626 686 992 best: 24.977 avg: 25.645 ns/v 127 | Testing seed 1476: 74 168 190 207 221 236 251 266 288 314 503 best: 28.408 avg: 28.801 ns/v 128 | Testing seed 1487: 2384 2848 2944 3008 3056 3120 3168 3216 3280 3392 3840 best: 32.097 avg: 32.712 ns/v 129 | Testing seed 1498: 0 29 39 46 53 59 65 72 80 96 165 best: 13.133 avg: 13.879 ns/v 130 | Testing seed 1509: 55 123 143 154 166 178 190 203 220 248 374 best: 22.633 avg: 23.365 ns/v 131 | Testing seed 1520: 1072 1360 1424 1472 1520 1568 1600 1632 1680 1760 2208 best: 35.988 avg: 36.607 ns/v 132 | Testing seed 1531: 64 164 192 212 230 251 268 292 318 355 647 best: 27.413 avg: 28.090 ns/v 133 | Testing seed 1542: 57 129 149 164 178 190 203 220 243 274 408 best: 17.880 avg: 18.536 ns/v 134 | Testing seed 1553: 28 74 88 98 107 117 129 139 153 172 255 best: 18.388 avg: 18.933 ns/v 135 | Testing seed 1564: 100 179 204 223 237 252 270 285 309 344 498 best: 23.168 avg: 23.687 ns/v 136 | Testing seed 1575: 1008 1376 1440 1472 1520 1552 1600 1648 1680 1760 2064 best: 33.952 avg: 34.340 ns/v 137 | Testing seed 1586: 0 23 32 38 45 50 58 65 72 85 156 best: 10.863 avg: 11.164 ns/v 138 | Testing seed 1597: 111 223 251 271 288 307 328 346 369 401 581 best: 23.269 avg: 23.600 ns/v 139 | Testing seed 
1608: 157 262 291 308 330 349 366 388 409 445 579 best: 22.879 avg: 23.563 ns/v 140 | Testing seed 1619: 68 143 165 184 201 216 230 247 265 292 432 best: 26.591 avg: 27.166 ns/v 141 | Testing seed 1630: 5392 5840 5968 6048 6128 6192 6272 6368 6496 6656 7424 best: 30.479 avg: 30.858 ns/v 142 | Testing seed 1641: 10 57 70 79 87 95 104 113 123 138 210 best: 17.004 avg: 17.366 ns/v 143 | Testing seed 1652: 383 607 682 731 777 824 874 930 1004 1105 1739 best: 25.280 avg: 25.559 ns/v 144 | Testing seed 1663: 189 311 346 366 388 411 429 448 473 509 693 best: 38.465 avg: 38.990 ns/v 145 | Testing seed 1674: 210 323 355 379 404 425 447 470 501 541 763 best: 22.258 avg: 22.870 ns/v 146 | Testing seed 1685: 39 107 125 141 152 164 176 190 206 230 379 best: 17.052 avg: 17.655 ns/v 147 | Testing seed 1696: 195 367 410 453 487 523 561 606 662 741 1126 best: 30.081 avg: 30.731 ns/v 148 | Testing seed 1707: 0 18 26 32 38 44 51 57 66 78 118 best: 9.247 avg: 9.836 ns/v 149 | Testing seed 1718: 1136 1360 1424 1488 1520 1568 1600 1648 1696 1776 2144 best: 35.863 avg: 36.313 ns/v 150 | Testing seed 1729: 100 238 278 314 345 374 412 448 504 578 967 best: 21.202 avg: 21.725 ns/v 151 | Testing seed 1740: 0 48 64 80 96 96 112 128 144 160 288 best: 30.915 avg: 31.541 ns/v 152 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Robin Hood Sort: the algorithm for uniform data 2 | 3 | Robin Hood Sort is a stable numeric sorting algorithm that achieves performance several times better than the fastest comparison sorts on uniformly random arrays, with worst-case performance similar to an unimpressive hybrid merge sort. It's in a similar category to counting sort and radix sort (and [switches](#counting-sort) to counting sort on small ranges), but works on any range unlike counting sort, and can be better than radix sort for large element sizes or small arrays. 
4 |
5 |     Best    Average        Worst      Memory    Stable    Deterministic
6 |     n       n log log n    n log n    n         Yes       Yes
7 |
8 | While this repository is only for demonstration and testing purposes, it appears that RH sort is of significant practical use. Poorly performing cases can be [detected](#statistical-detection) with good probability from a random sample of `√n` values (that's an exact, not asymptotic, count) selected from an array of length `n` and then sorted. This test is integrated into pivot selection as part of [distribution crumsort](https://github.com/mlochbaum/distcrum) (a high-performance quicksort), allowing RH to be used as an alternate base case.
9 |
10 | ![Visualization](images/robinhood.gif)
11 |
12 | Compared below are merge sort [quadsort](https://github.com/scandum/quadsort), top-of-the-line hybrid quicksorts [pdqsort](https://github.com/orlp/pdqsort) and [fluxsort](https://github.com/scandum/fluxsort), and radix sorts [wolfsort](https://github.com/scandum/wolfsort) (also a bit of a hybrid) and [ska_sort_copy](https://probablydance.com/2016/12/02/investigating-radix-sort/) (note that most benchmarks elsewhere are based on the slower in-place [ska_sort](https://probablydance.com/2017/01/17/faster-sorting-algorithm-part-2/)). If you're wondering, Timsort is no good with integer arrays like this, and single-core IPS⁴o loses to the quicksorts on random data.
13 |
14 | ![Performance bar chart](images/wolf.svg)
15 |
details 16 | 17 | ```sh 18 | # To download other sorts and build 19 | ./wolfbench.sh 20 | # Perform the benchmark; save results 21 | ./runwolfbench > res/wolf.txt 22 | # Make chart (requires BQN) 23 | images/bar.bqn res/wolf.txt > images/wolf.svg 24 | ``` 25 | 26 |
27 | 28 | So Robin Hood is tested against the fastest sorting algorithms I know, on wolfsort's own benchmark suite. And ska_sort_copy is the only real contender on random data (has the sheep exchanged clothing with the wolf?). Comparison sorts never stood a chance! Robin Hood is very skilled—don't forget it—but his greatest skill is cheating. 29 | 30 | The method is based on daring optimism: allocate more than enough space for all values, then simply try to place each one where it ought to go, numerically speaking ([details below](#algorithm)). If there's no room… then Robin Hood must rely on trickery. First, here's how it scales on random data. 31 | 32 | ![Performance line plot](images/rand.svg) 33 |
details 34 | 35 | ```sh 36 | # Perform the benchmarks; save results 37 | gcc -O3 -D NOTEST bench.c && ./a.out l > res/r_rh.txt 38 | gcc -O3 -D FLUXSORT -D NOTEST bench.c && ./a.out l > res/r_flux.txt 39 | gcc -O3 -D WOLFSORT -D NOTEST bench.c && ./a.out l > res/r_wolf.txt 40 | g++ -w -fpermissive -O3 -D SKACOPY -D NOTEST bench.c && ./a.out l > res/r_ska_.txt 41 | 42 | # Make chart (requires BQN) 43 | images/line.bqn res/r_{flux,wolf,ska_,rh}.txt > images/rand.svg 44 | ``` 45 | 46 |
47 | 48 | Robin Hood is without equal under length 100,000 for this good—but not best—case. Beyond that the picture is not so nice because of the large buffer and scattershot approach in filling it. My L2 cache fits 128k 4-byte integers, and there's a clear increase as it's overwhelmed. But for the intended use this is no concern: fluxsort (like any quicksort) works by splitting the array in half, and sorting recursively. Switching to Robin Hood at the appropriate length would make an overall algorithm with fluxsort's very clean cache usage, and reduce the required buffer size as well. Imagine cutting off the right side of the fluxsort graph to affix it to Robin Hood's line lower down. The Quad Merge Robin [variant](#variations) does this with quadsort, and my [distribution crumsort](https://github.com/mlochbaum/distcrum) includes code to shuffle pivots around (not stable) to keep track of the range of subarrays with just a quick initial pass, then a few swaps per partition. 49 | 50 | As you might guess, "Asc. tiles" from the bar chart hints at a significant problem: the range can be clumpy instead of uniform. Here's my best guess at a worst case for RH sort: one very large element followed by many random small ones. 51 | 52 | ![Breakdown broken down](images/bad.svg) 53 |
details 54 | 55 | ```sh 56 | # Do benchmark 57 | gcc -O3 -D WORST -D PROFILE -D NOTEST bench.c && ./a.out l > res/sp_rh.txt 58 | gcc -O3 -D WORST -D MERGESORT -D NOTEST bench.c && ./a.out l > res/s_merge.txt 59 | # For comparison: don't use worst case because in flux it's better 60 | gcc -O3 -D QUADSORT -D NOTEST bench.c && ./a.out l > res/s_quad.txt 61 | gcc -O3 -D FLUXSORT -D NOTEST bench.c && ./a.out l > res/r_flux.txt 62 | # Make image 63 | images/line.bqn res/{sp_rh,s_merge,s_quad,r_flux}.txt > images/bad.svg 64 | ``` 65 | 66 |
67 | 68 | All these small elements get sent to the same index (with maybe a slight difference for very large arrays). But the Robin Hood algorithm has a mechanism to rescue them from the buffer, 16 values at a time. These values are then merge sorted. The buffer insertion amounts to an insertion sort on these blocks of 16, which is why it can be said that RH degrades to a bad hybrid merge sort. 69 | 70 | The RH graph here is split into a few parts. The large ones are insertion near the bottom and merging at the top. Also shown is a pure merge sort based on the same merging code. This merging code is not fast (for now at least), but if you've downloaded quadsort using `./wolfbench.sh`, you can build with `-D QUADMERGE` to use it instead (see [variations](#variations)). Then it comes within about 10ns/value of quadsort—the 20ns/v of insertion work pays for itself and then some relative to the poorly optimized merge sort, but is only worth about 10ns/v of quadsort time. 71 | 72 | Horrible cases like this are easily detectable in a quicksort during median selection. Figuring out how bad it can get without being detectable from a sample will require some more benchmarks and statistics. 73 | 74 | ## Algorithm 75 | 76 | The main idea of Robin Hood Sort is to allocate a buffer a few times (2.5 or more) larger than the array to be sorted, then send each entry to the appropriate position in the buffer for its value. To get this position, subtract the minimum value, then bitshift, with the shift amount chosen based on the range. If there's no room in that spot, move entries forward to find space for it, keeping everything in sorted order (that is, place the new entry after any smaller or equal ones, shifting larger ones forward by 1 to make room). Afterwards, the entries are put back in the list with branchless filtering to remove the gaps. 
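A minimal sketch of the placement and filtering steps just described, with no block stealing (so it behaves like the Brave Robin variant and can go quadratic on clumpy input). This is not the code in `rhsort.c`: the name `rh_sketch_sort`, the separate `used` flags for marking occupied slots, and the rule for choosing the shift are assumptions made to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative only: Robin Hood placement with no block stealing.
   Returns 0 on success, -1 if allocation fails. */
int rh_sketch_sort(uint32_t *x, size_t n) {
    assert(x != NULL || n == 0);
    if (n < 2) return 0;

    /* Range-finding pass. */
    uint32_t min = x[0], max = x[0];
    for (size_t i = 1; i < n; i++) {
        if (x[i] < min) min = x[i];
        if (x[i] > max) max = x[i];
    }
    if (min == max) return 0;

    /* Assumed shift rule: shrink the range until it fits in fewer than 4n slots. */
    uint64_t range = (uint64_t)max - min;
    unsigned shift = 0;
    while ((range >> shift) >= 4 * (uint64_t)n) shift++;
    size_t slots = (size_t)(range >> shift) + 1;
    size_t cap = slots + n; /* room for a chain that runs past the last slot */

    uint32_t *buf = calloc(cap, sizeof *buf);
    unsigned char *used = calloc(cap, 1);
    if (!buf || !used) { free(buf); free(used); return -1; }

    for (size_t i = 0; i < n; i++) {
        uint32_t v = x[i];
        size_t j = (size_t)((v - min) >> shift); /* target position */
        /* Skip smaller or equal entries so equal values keep their order. */
        while (used[j] && buf[j] <= v) j++;
        /* Find the end of the chain, then shift larger entries forward by one. */
        size_t e = j;
        while (used[e]) e++;
        for (size_t k = e; k > j; k--) buf[k] = buf[k - 1];
        buf[j] = v;
        used[e] = 1;
    }

    /* Filter out the gaps: always write, but advance only past occupied slots. */
    size_t k = 0;
    for (size_t i = 0; k < n; i++) {
        x[k] = buf[i];
        k += used[i];
    }

    free(buf);
    free(used);
    return 0;
}
```

Calling `rh_sketch_sort(a, len)` sorts `a` in place. The flags cost an extra byte per slot; they are used here only because they make the empty-slot test obvious.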
The algorithm is named for [Robin Hood hashing](https://programming.guide/robin-hood-hashing.html), which does exactly the same thing to find space for hashes within a hash table. In fact, as Paul Khuong [writes](https://pvk.ca/Blog/2019/09/29/a-couple-of-probabilistic-worst-case-bounds-for-robin-hood-linear-probing/), "I like to think of Robin Hood hash tables with linear probing as arrays sorted on uniformly distributed keys, with gaps". The name for RH hashing comes from the idea that those shifted entries are "rich" because they got there first, and the new entry gets to rob a space from them. I don't think this is the best metaphor, [as I explained](https://youtu.be/paxIkKBzqBU?t=1340) in a talk on searching. 77 | 78 | Robin Hood hashing works on the assumption the hashes are uniformly distributed, but the real Robin Hood doesn't live in some kind of communist utopia. Robin Hood Sort works even in non-ideal conditions. The strategy is that when an area gets particularly rich in array values, some of them are scooped up and moved back to the beginning of the original array, which we're not using any more. These stolen values will later be merged—hey, you know, this is much more like Robin Hood than that other part. Let's pretend this is where the name comes from. Anyway, once all the entries from the original array have been placed, the remaining ones from the buffer are moved back, after the stolen ones. Then the stolen values are merged (as in merge sort) with each other, and one last merge with the remaining values completes the sort. 79 | 80 | Stealing happens after a value is inserted to the buffer. It's triggered based on the number of positions touched, starting at the target position and ending at the last value pushed out of the way. The simplest form is to move the entire chain. 
This starts at the first nonempty value, possibly before the target position, to avoid moving a value before one that's equal to it and breaking stability (besides, stealing more values tends to be faster overall). But a modification speeds things up: round down to the nearest multiple of a configurable block size, which is a power of two. Then because every block is sorted, the first few passes of merge sorting on stolen values can be skipped. Since chains in the buffer are constructed by a process that's basically insertion sort, we should expect overall performance in the worst case to resemble a hybrid merge sort, where smaller merges are skipped in favor of insertion sort. Here, the block size is set to 16. The initial threshold for stealing is twice the block size, because merging any number of stolen values with the rest of the array is costly, and it's reduced to the block size the first time any values are stolen. 81 | 82 | ![Performance breakdown](images/parts.svg) 83 |
details 84 | 85 | ```sh 86 | # Do benchmark 87 | gcc -O3 -D PROFILE -D NOTEST bench.c && ./a.out l > res/rp_rh.txt 88 | # For comparison 89 | gcc -O3 -D FLUXSORT -D NOTEST bench.c && ./a.out l > res/r_flux.txt 90 | # Make image 91 | images/line.bqn res/r{_flux,p_rh}.txt > images/parts.svg 92 | ``` 93 | 94 |
95 | 96 | Time taken in each section: from bottom to top, range-finding, buffer initialization, insertion, and filtering. Merging is rare for random data and doesn't appear on this graph. 97 | 98 | ## Analysis 99 | 100 | Here we'll show that Robin Hood Sort can achieve O(n log log n) *average* time on random arrays with range at least twice the length (for smaller ranges, use counting/bucket sort and get guaranteed O(n) time), and O(n log(n)) worst-case time. And that it's stable. For practical purposes, that average time is effectively linear. Unfortunately, for practical purposes it's also not achieved: these times are based on the typical random-access model where reading or writing any value in memory takes constant time. Real-world random access scales more like `√n` as various levels of cache get exhausted, and this causes RH sort to lose ground against quicksorts as `n` increases, instead of gaining as theory would suggest. 101 | 102 | Here are the steps we need to consider: 103 | * Find the minimum and maximum 104 | * Initialize the buffer 105 | * Insert to the buffer 106 | * Move/filter elements back from the buffer 107 | * Merge sort stolen blocks 108 | * Final merge 109 | 110 | In fact, every step of this except the merge sort is O(n). This is obvious for the range-finding and final merge, as they're done in one pass. Same for initializing and filtering the buffer, provided the length of the buffer is O(n). But it's at most `5*n` by construction, so that's handled. The only parts left are the buffer insertion, and that merge sort. 111 | 112 | It's not obvious, but buffer insertion is kept to O(n b) by the block-stealing mechanism. When a value is inserted, it could touch any number of positions in the buffer. But if this number is greater than some fixed threshold `b`, some values will be removed. Let's split the cost into two parts: cost pertaining to values that stay, and to values that are then stolen. 
Less than `b` values can stay on each insertion, so this portion of the cost is under `n*b` entries. And each value can be removed only once, so the total cost pertaining to removed values is proportional to `n`. The sum of the two parts is then proportional to `n*b`. Even with a variable `b` as discussed below, it's less than log(n) and doesn't affect the O(n log(n)) worst case. 113 | 114 | Merge sort has a well known worst case bound of O(n log n). This is where most of our cost comes from: in the worst case, nearly all values are snatched from the buffer and the cost reaches that bound. However, the expectation for uniform distributions is different: very few blocks should be stolen. To figure out what exactly the consequences are, we need to know something about the statistics of Robin Hood hashing. 115 | 116 | ### Average case 117 | 118 | The `n log log n` bound on average time requires setting a variable block size. But the size scales with `log log n`, that is, it hardly scales at all. And it's silly to apply RH sort to very large random arrays because the cache patterns are horrible. So I'm sticking with a fixed block size in the actual implementation. 119 | 120 | From lemma 5.4 [here](http://opendatastructures.org/ods-python/5_2_LinearHashTable_Linear_.html), the probability that a chain of length `k` starts at any particular position is at most `cᵏ`, for a constant `c` that depends on the load factor and is less than 1 for a half-full table (specifically, `α*exp(α)` for load factor `α`). A random position hits such a chain with probability proportional to its length, so that the total expected number of elements moved on one insertion is `b²*c^b` (we also need to consider chains longer than `b`, but these only contribute a constant factor). Setting `b = -2*log_c(log₂(n))`, which is positive since `c<1`, `c^b` is `1/log²(n)`, giving a theft-per-insertion expectation of `b²/log²n < 1/log n`. 
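Spelling out that substitution with the same symbols `b`, `c` and `n` (a sketch that takes logarithms base 2 and ignores constant factors):

```math
\begin{aligned}
b &= -2\log_c(\log_2 n) \\
c^{b} &= \left(c^{\log_c(\log_2 n)}\right)^{-2} = \frac{1}{\log_2^2 n} \\
b^2 c^{b} &= \frac{b^2}{\log_2^2 n} < \frac{1}{\log_2 n} \quad \text{once } b^2 < \log_2 n
\end{aligned}
```

The last condition holds for large enough `n` because `b` grows only like `log log n`.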
Now the total asymptotic cost is the cost of insertions plus the cost of merge sorting those blocks:
121 |
122 |     cost < n*b + (n/log n)log(n/log n)
123 |          < n*b + (n/log n)log(n)
124 |          = n*b + n
125 |
126 | The total time is now proportional to `n log log n`.
127 |
128 | ### Stability
129 |
130 | Sorting stability means that two equal values in the initial array maintain their order in the sorted array, that is, a value with a lower index isn't swapped with an equal value with a higher index. For the function `rhsort32` implemented here, it's meaningless: an integer can't remember where it came from. However, stability can be important when sorting one array of values according to another array of keys. Robin Hood Sort could probably be adapted to this use case, so it's nice that it's stable without any significant performance cost.
131 |
132 | During buffer insertion, the values in the buffer have lower indices than those yet to be inserted. So an inserted value needs to be placed after any equal values it encounters. No problem.
133 |
134 | Stolen values are placed at the beginning of the array, which doesn't reorder them with respect to any uninserted values but could break ordering with respect to the buffer. Equal values are always adjacent in the buffer, so we just need to be sure Robin Hood doesn't grab a trailing component of a group of equal values. The fix is to walk back to the beginning of the chain (or at least the previous unequal value, but this benchmarks worse) before stealing.
135 |
136 | After insertion, filtering is obviously stable, and merging is well known to be stable.
137 |
138 | Since stability's meaningless for pure numeric sorting, `rhsort32` does include some non-stable optimizations. First, it uses counting sort for small ranges, which must be changed to bucket sort, that is, it should actually move values from the initial array instead of just reconstructing them from counts.
Second, it uses the array's maximum for an "empty space" sentinel value in the buffer, and infers that any missing elements after filtering were equal to it. For a sort-by operation, it should be modified to one plus the maximum: it's fine if this overflows, but the full-range case where both minimum and maximum possible values are present has to be excluded somehow.
139 |
140 | ## Variations
141 |
142 | The following options can be applied when compiling rhsort.c:
143 |
144 | - Brave Robin (`-D BRAVE`) disables block stealing, leading to O(n) average case performance (like hash table insertion) and O(n²) worst case. It needs to allocate and initialize more memory because overflows can be longer, leading to slower actual performance except at small sizes.
145 | - Quad Robin (`-D QUADMERGE`) uses quadsort's methods for merging stolen blocks together, making the worst case significantly better, only 10 to 20ns/value worse than quadsort.
146 | - Merge Robin (function `rhmergesort32`) is an O(n log(n)) merge sort hybrid that uses Robin Hood sort for sizes below `2^16`, then merges these units together. It's faster for large arrays, but only if `QUADMERGE` is also used.
147 |
148 | ## Counting sort
149 |
150 | Because it relies on shifting values right to fit the range into the buffer, Robin Hood Sort begins to degrade when the range gets smaller than the buffer should be. This could be fixed with another loop that shifts left, but these ranges are a natural fit for counting sort, so it switches to that instead. To write the values out after counting, a normal loop is used for high densities, while a strategy based on prefix sums is used for lower ones (it can be sped up [with SIMD instructions](https://en.algorithmica.org/hpc/algorithms/prefix/), but I haven't found a way to generate code like this without compiler intrinsics).
151 |
152 | ![Small-range performance](images/range.svg)
153 |
details
154 |
155 | ```sh
156 | gcc -D RANGES -O3 bench.c && ./a.out l 32 > res/c_rh.txt
157 | gcc -D RANGES -D FLUXSORT -O3 bench.c && ./a.out l 32 > res/c_flux.txt
158 | gcc -D RANGES -D QUADSORT -O3 bench.c && ./a.out l 32 > res/c_quad.txt
159 |
160 | # Make image
161 | images/line.bqn res/c_{quad,flux,rh}.txt > images/range.svg
162 | ```
163 |
164 |
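One way the prefix-sum write-out mentioned in the Counting sort section could look, as a sketch rather than the code in `rhsort.c`. The function name `count_sort_scan` is hypothetical, and the caller is assumed to have already checked that the range is small enough to allocate a count table for. Instead of a nested loop that writes each value `count` times, it marks the output positions where the value steps up, then takes a running sum.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative counting sort for small ranges with a scan-based write-out.
   Returns 0 on success, -1 if allocation fails. */
int count_sort_scan(uint32_t *x, size_t n) {
    assert(x != NULL || n == 0);
    if (n < 2) return 0;

    uint32_t min = x[0], max = x[0];
    for (size_t i = 1; i < n; i++) {
        if (x[i] < min) min = x[i];
        if (x[i] > max) max = x[i];
    }

    size_t r = (size_t)(max - min) + 1; /* number of distinct possible values */
    size_t *count = calloc(r, sizeof *count);
    if (!count) return -1;
    for (size_t i = 0; i < n; i++) count[x[i] - min]++;

    /* x[k] will first hold how far the value rises at output position k. */
    for (size_t i = 0; i < n; i++) x[i] = 0;
    x[0] = min;
    size_t pos = 0;
    /* The maximum occurs at least once, so pos stays below n in this loop. */
    for (size_t j = 0; j + 1 < r; j++) {
        pos += count[j];
        x[pos]++;
    }

    /* A running sum turns the steps into the sorted values. */
    for (size_t i = 1; i < n; i++) x[i] += x[i - 1];

    free(count);
    return 0;
}
```

The marking loop has no data-dependent branch, which is the property that makes it attractive when most counts are zero or one.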
165 | 166 | ## Statistical detection 167 | 168 | When Robin Hood performs poorly, the root cause is too many buffer collisions, that is to say, too many values that are close to each other in the buffer. But if that's the case, it should show in a random sample of enough values from the array. As it turns out, a sample of size `√n` is enough, because each of the roughly `n/2` *pairs* of values can be tested. The metric, or score, used considers the block size of 16 to be the threshold for a pair to count as a collision. If the distance `d` between a pair of samples is less than that, then it counts for `16-d` points in the score. The sum over all pairs can be computed efficiently by sorting the sample, then scanning over it (this would get slow with many samples close to each other, except that we can exit early if the total exceeds the maximum allowed). Because the score can only increase as more samples are added, an option is to start with a smaller number of samples but add more if collisions aren't found, keeping the sampling cost low on poor distributions with some extra sample-merging overhead for larger ones. 169 | 170 | In the following tests, piecewise-linear distributions are generated using a strategy that allows for large dense regions and gaps. On each distribution, a single average sorting time is measured, and many samples are taken to find a distribution of scores. The test is done with quadsort's merge (Quad Robin) because that's most relevant to practical application. The uniform cases at the very bottom of the graphs are reasonably well separated from the ones that do worse than fluxsort. A score threshold between 70 and 100 is sufficient to accept the good cases while keeping the chance of slowdown on a bad array small. 171 | 172 | ![Criterion performance 100/10000](images/crit.svg) 173 |
details
174 |
175 | ```sh
176 | gcc -D QUADMERGE -D NOTEST -O3 crit.c && ./a.out > res/crit.txt
177 | images/scatter.bqn res/crit.txt > images/crit.svg
178 | ```
179 |
180 |
181 | 182 | ![Criterion performance 316/100000](images/crit5.svg) 183 |
details
184 |
185 | ```sh
186 | gcc -D LENGTH=100000 -D QUADMERGE -D NOTEST -O3 crit.c && ./a.out > res/crit5.txt
187 | images/scatter.bqn res/crit5.txt > images/crit5.svg
188 | ```
189 |
190 |
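The scoring rule from the Statistical detection section can be sketched as follows. This is an illustration, not the code in `crit.c`: the function name `collision_score` is hypothetical, and the samples are assumed to have already been converted to buffer positions, so that distances are comparable with the block size of 16.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK 16

static int cmp_u32(const void *a, const void *b) {
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Each pair of samples at distance d < BLOCK adds BLOCK - d to the score.
   Sorts pos in place and returns early once the score exceeds limit. */
uint64_t collision_score(uint32_t *pos, size_t m, uint64_t limit) {
    assert(pos != NULL || m == 0);
    if (m < 2) return 0;
    qsort(pos, m, sizeof *pos, cmp_u32);

    uint64_t score = 0;
    for (size_t i = 0; i < m; i++) {
        /* After sorting, only the next few samples can be within BLOCK. */
        for (size_t j = i + 1; j < m && pos[j] - pos[i] < BLOCK; j++) {
            score += BLOCK - (pos[j] - pos[i]);
            if (score > limit) return score;
        }
    }
    return score;
}
```

A caller would compare the result against a threshold such as the 70 to 100 range suggested above and fall back to another sort when it is exceeded. The early exit keeps the inner loop from going quadratic when many samples land close together.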
191 | 192 | ## History 193 | 194 | Robin Hood hashing is fairly well known to algorithms enthusiasts. I encountered it in 2017 or 2018 when working on search algorithms for Dyalog APL, and later presented a SIMD-oriented version in [a talk](https://youtu.be/paxIkKBzqBU) ([zipped slides](https://www.dyalog.com/uploads/conference/dyalog18/presentations/D08_Searches_Using_Vector_Instructions.zip)) on the algorithms eventually released in Dyalog 18.0. This implementation features the idea of dropping entries with too great of an offset from the hash table in order to protect against inputs engineered to defeat the hash. It guarantees linear insertion performance because the maximum offset is also the greatest number of times an element can be touched during insertion. But for sorting it would require storing or recomputing the offset, and the degradation with non-uniform inputs is much worse. 195 | 196 | What I now call Brave Robin sort was originally used in a method of generating simple random samples in 2021: see the algorithm for `Subset` with Floyd's method [here](https://mlochbaum.github.io/BQN/implementation/primitive/random.html#simple-random-sample). [Dzaima](https://github.com/dzaima) and I developed the Knuth version when [Elias Mårtenson](https://github.com/lokedhs) asked about generating samples in his array language KAP—dzaima remarked that "the hashmap doesn't really need to hash, as the keys are already guaranteed random". The next day I hit on the idea of using a bitshift as an order-preserving hash and the Robin Hood insertion strategy to build a "sorted not-hash" table, resulting in [this implementation](https://github.com/dzaima/CBQN/pull/1). 197 | 198 | I'd also done a fair bit of research on sorting by this time. 
I'd made some improvements to pdqsort at Dyalog, mostly described in my [sorting notes](https://mlochbaum.github.io/BQN/implementation/primitive/sort.html), and more recently had [worked](https://github.com/scandum/fluxsort/issues/1) with [scandum](https://github.com/scandum) on some details of fluxsort. So I found the random subset algorithm interesting as a specialized sorting algorithm, but the possibility of an O(n²) bad case seemed too dangerous. Eventually I considered trying to improve this worst case, and figured out the block-stealing mechanism within a day or two. Seemed very promising; I wrote the code and started benchmarking pretty much immediately. 199 | --------------------------------------------------------------------------------