├── speed.pdf ├── plots ├── ones.png ├── 90pctail.png ├── 99pctail.png ├── random.png ├── reverse.png ├── sorted.png ├── 80pcsorted.png ├── 90pcsorted.png ├── 99pcsorted.png ├── many-dupes.png ├── 99.9pcsorted.png ├── few-spikes-with-noise.png └── speed_png.plot ├── .gitignore ├── LICENSE ├── Makefile ├── timer.h ├── strings.cpp ├── README.md ├── progress_bar.h ├── sort.cpp ├── benchmark.h ├── speed.plot ├── ssssort.h └── stats.txt /speed.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/speed.pdf -------------------------------------------------------------------------------- /plots/ones.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/ones.png -------------------------------------------------------------------------------- /plots/90pctail.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/90pctail.png -------------------------------------------------------------------------------- /plots/99pctail.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/99pctail.png -------------------------------------------------------------------------------- /plots/random.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/random.png -------------------------------------------------------------------------------- /plots/reverse.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/reverse.png -------------------------------------------------------------------------------- /plots/sorted.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/sorted.png -------------------------------------------------------------------------------- /plots/80pcsorted.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/80pcsorted.png -------------------------------------------------------------------------------- /plots/90pcsorted.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/90pcsorted.png -------------------------------------------------------------------------------- /plots/99pcsorted.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/99pcsorted.png -------------------------------------------------------------------------------- /plots/many-dupes.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/many-dupes.png -------------------------------------------------------------------------------- /plots/99.9pcsorted.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/99.9pcsorted.png -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | qsort/ 2 | sort 3 | sort_* 4 | strings 5 | strings_* 6 | G* 7 | perf.* 8 | *.txt 9 | *.pdf -------------------------------------------------------------------------------- /plots/few-spikes-with-noise.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/lorenzhs/ssssort/HEAD/plots/few-spikes-with-noise.png -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2016 Lorenz Hübschle-Schneider 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | FLAGS=-std=c++14 -Wall -Wextra -Werror 2 | # if you want to be pedantic, uncomment this line (for CXX=clang++): 3 | #FLAGS+=-Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-shadow -Wno-global-constructors -Wno-padded 4 | 5 | all: release 6 | 7 | debug: 8 | $(CXX) $(FLAGS) -O0 -ggdb -o sort sort.cpp 9 | 10 | release: 11 | $(CXX) $(FLAGS) -O3 -g -DNDEBUG -o sort sort.cpp 12 | 13 | string_debug: 14 | $(CXX) $(FLAGS) -O0 -ggdb -o strings strings.cpp 15 | 16 | string_release: 17 | $(CXX) $(FLAGS) -O3 -g -DNDEBUG -o strings strings.cpp 18 | 19 | run: 20 | ./sort 21 | 22 | 23 | _sqlplot: 24 | sp-process speed.plot 25 | 26 | _gnuplot: 27 | gnuplot speed.plot 28 | 29 | _fixup: 30 | # fixup standard deviation lines 31 | sed -i 's/title "algo=ssssort,a=\(-\)*1"\( notitle ls 3\)* with lines\(points\)*/notitle ls 3 with lines/' speed.plot 32 | sed -i 's/title "algo=stdsort,a=\(-\)*1"\( notitle ls 5\)* with lines\(points\)*/notitle ls 5 with lines/' speed.plot 33 | # set average line style 34 | sed -i 's/title "algo=ssssort,a=0"\( ls 4\)*/title "ssssort" ls 4/' speed.plot 35 | sed -i 's/title "algo=stdsort,a=0"\( ls 6\)*/title "std::sort" ls 6/' speed.plot 36 | 37 | plot: _sqlplot _fixup _gnuplot 38 | -------------------------------------------------------------------------------- /timer.h: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | * timer.h 3 | * 4 | * A flexible wall-clock timer 5 | * 6 | ******************************************************************************* 7 | * Copyright (C) 2016 Lorenz Hübschle-Schneider 8 | * 9 | * The MIT License (MIT) 10 | * 11 | * Permission is hereby granted, free of charge, to any person obtaining a copy 12 | * of this 
software and associated documentation files (the "Software"), to deal 13 | * in the Software without restriction, including without limitation the rights 14 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 15 | * copies of the Software, and to permit persons to whom the Software is 16 | * furnished to do so, subject to the following conditions: 17 | * 18 | * The above copyright notice and this permission notice shall be included in 19 | * all copies or substantial portions of the Software. 20 | 21 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 22 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 23 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 24 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 25 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 26 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 27 | * SOFTWARE. 28 | ******************************************************************************/ 29 | 30 | 31 | #pragma once 32 | 33 | #include <chrono> 34 | 35 | /// A flexible timer. TimeT is the precision of the timing, while scalingFactor 36 | /// is the factor by which the output will be scaled. The default is to 37 | /// return milliseconds with microsecond precision. 
38 | template <typename TimeT = std::chrono::microseconds, int scalingFactor = 1000, typename return_type = double> 39 | struct TimerT { 40 | TimerT() { 41 | reset(); 42 | } 43 | 44 | void reset() { 45 | start = std::chrono::system_clock::now(); 46 | } 47 | 48 | return_type get() const { 49 | TimeT duration = std::chrono::duration_cast<TimeT>(std::chrono::system_clock::now() - start); 50 | return (duration.count() * 1.0) / scalingFactor; 51 | } 52 | 53 | return_type get_and_reset() { 54 | auto t = get(); 55 | reset(); 56 | return t; 57 | } 58 | 59 | private: 60 | std::chrono::system_clock::time_point start; 61 | }; 62 | 63 | /// A timer that is accurate to microseconds, formatted as milliseconds 64 | typedef TimerT<> Timer; 65 | -------------------------------------------------------------------------------- /strings.cpp: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | * sort_strings.cpp 3 | * 4 | * Test runner for strings 5 | * 6 | ******************************************************************************* 7 | * Copyright (C) 2016 Lorenz Hübschle-Schneider 8 | * 9 | * The MIT License (MIT) 10 | * 11 | * Permission is hereby granted, free of charge, to any person obtaining a copy 12 | * of this software and associated documentation files (the "Software"), to deal 13 | * in the Software without restriction, including without limitation the rights 14 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 15 | * copies of the Software, and to permit persons to whom the Software is 16 | * furnished to do so, subject to the following conditions: 17 | * 18 | * The above copyright notice and this permission notice shall be included in 19 | * all copies or substantial portions of the Software. 20 | 21 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 22 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 23 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 24 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 25 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 26 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 27 | * SOFTWARE. 28 | ******************************************************************************/ 29 | 30 | 31 | #include <algorithm> 32 | #include <cstdlib> 33 | #include <fstream> 34 | #include <iostream> 35 | #include <random> 36 | #include <string> 37 | #include <vector> 38 | 39 | #include "benchmark.h" 40 | 41 | int main(int argc, char *argv[]) { 42 | size_t outer_its = 5, inner_its = 3; 43 | if (argc > 1) outer_its = static_cast<size_t>(atol(argv[1])); 44 | if (argc > 2) inner_its = static_cast<size_t>(atol(argv[2])); 45 | 46 | std::string input_file = "input.txt"; 47 | if (argc > 3) input_file = std::string{argv[3]}; 48 | // Read input file, skipping empty lines 49 | std::ifstream input(input_file); 50 | std::vector<std::string> lines; 51 | std::string line; 52 | while (getline(input, line)) { 53 | if (line.empty()) continue; 54 | lines.emplace_back(line); 55 | } 56 | 57 | std::string stat_file = "stats_strings.txt"; 58 | if (argc > 4) stat_file = std::string{argv[4]}; 59 | std::ofstream *stat_stream = nullptr; 60 | if (stat_file != "-") { 61 | stat_stream = new std::ofstream; 62 | stat_stream->open(stat_file); 63 | } 64 | 65 | using data_t = std::string; 66 | 67 | sized_benchmark_generator<data_t>([&lines](auto data, size_t size){ 68 | size_t num_lines = std::min(size, lines.size()); 69 | std::copy(lines.cbegin(), lines.cbegin() + num_lines, data); 70 | return num_lines; 71 | }, "file", outer_its, inner_its, stat_stream, true); 72 | 73 | /* 74 | benchmark_generator<data_t>([](auto data, size_t size){ 75 | using T = std::remove_reference_t<decltype(*data)>; 76 | std::mt19937 rng{ std::random_device{}() }; 77 | for (size_t i = 0; i < size; ++i) { 78 | data[i] = "";//static_cast<T>(rng()); 79 | } 80 | }, "random", iterations, stat_stream); 81 | */ 82 | 83 | 84 | if (stat_stream != nullptr) { 85 | stat_stream->close(); 86 | delete stat_stream; 87 
| } 88 | } 89 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # ssssort — Super Scalar Sample Sort 2 | 3 | [Super Scalar Sample 4 | Sort](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.72.366&rep=rep1&type=pdf) 5 | is a sorting algorithm optimized for modern-ish hardware. This is an 6 | implementation in C++14. It is faster than `std::sort` in many 7 | cases, often taking only half to two thirds of the time. However, ssssort 8 | uses quite a bit of additional memory (up to 2-3x input size). 9 | This means that it's not applicable in all situations, but when 10 | it is, it's pretty quick! 11 | 12 | **But you shouldn't use this code** because there are even quicker methods around. In particular, [IPS⁴o](https://github.com/ips4o/ips4o) outperforms this code almost all of the time (benchmarks can be found in [the paper](https://arxiv.org/pdf/2009.13569.pdf), which is freely available) *and* doesn't have as much memory overhead. You should go and use IPS⁴o instead, and disregard this repository. Its main purpose is to serve as an implementation of Super Scalar Sample Sort for those who want to compare their sorter to SSSS specifically. 13 | 14 | ### tl;dr: use [IPS⁴o](https://github.com/ips4o/ips4o) instead of this code if you want a fast sorting algorithm 15 | 16 | ## Benchmarks 17 | 18 | We performed some tests with sorting integers and compared Super Scalar Sample 19 | Sort to `std::sort`. Most notably, when sorting random integers, our 20 | implementation ran in 50 to 65% of the time taken by `std::sort`! The plot below 21 | shows the time divided by `n log(n)`, where `n` is the input size. We chose this 22 | normalization because that's the lower bound on comparison-based sorting you may 23 | remember from your algorithms class. Thus the plot shows the time spent per 24 | required comparison. 
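As a rough illustration of the normalization described above, the following sketch converts a measured running time into nanoseconds per `n log2(n)`. The helper name `normalized_ns` and the assumption that the measured time is given in milliseconds are illustrative choices, not part of this repository's API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not part of ssssort): normalize a running time,
// assumed to be given in milliseconds, to nanoseconds per n * log2(n).
inline double normalized_ns(double time_ms, std::size_t n) {
    assert(n >= 2);  // log2(1) == 0 would lead to a division by zero
    const double nd = static_cast<double>(n);
    // 1 ms = 1e6 ns; divide by the n * log2(n) lower bound on comparisons
    return time_ms * 1e6 / (nd * std::log2(nd));
}
```

Under this normalization, an input of 2^20 elements sorted in about 41.9 ms works out to roughly 2 ns per required comparison.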
25 | 26 | ![sorting random integers](plots/random.png) 27 | 28 | `std::sort` is awfully fast on data that is already sorted (both 29 | [in the right](plots/sorted.png) and [reverse order](plots/reverse.png)). We 30 | can't match that. However, as soon as even 0.1% of elements aren't in the right 31 | place, its advantage [breaks down immediately](plots/99.9pcsorted.png)! This is 32 | also true when the first 99% of the array are sorted, and only the 33 | [last 1% contains random data](plots/99pctail.png). 34 | 35 | You can find plots for some more workloads in the [plots](plots/) folder, or 36 | suggest new benchmarks by filing an issue. The file [speed.pdf](speed.pdf) also 37 | contains the same plots with additional ±1 standard deviation lines. 38 | 39 | We performed our experiments on a Haswell Core-i7 4790T machine with 16 GiB of 40 | DDR3-1600, but only used one core to keep things reproducible. All numbers are 41 | averages over several runs - for the randomized inputs, 100 different inputs for 42 | the small instances down to 25 different ones for the larger ones, each with 10 43 | repetitions. For the deterministic input generators, we ran between 1000 and 44 | 100 iterations, again depending on input size. You can find the exact logic in 45 | [benchmark.h](benchmark.h). 46 | 47 | ## Usage 48 | 49 | Just include `ssssort.h` and use `ssssort::ssssort(Iterator begin, Iterator end)`. 50 | Or, if you want the output to be written somewhere else, use the version with 51 | three iterators: `ssssort::ssssort(InputIt begin, InputIt end, OutputIt out_begin)`. 52 | Note that the input range will be in an arbitrary order after calling this. 53 | 54 | ## Implementation 55 | 56 | The implementation is fairly close to the paper, but uses `std::sort` as base 57 | case for sorting less than 1024 elements. As-is the code technically requires a 58 | C++14 compiler, even though `g++` is happy to compile it with `-std=c++11`. 
The 59 | requirement stems from the use of a variable declaration in the `find_bucket` 60 | function, which is marked `constexpr`. You can simply replace `constexpr` with 61 | `inline` to make it valid C++11. 62 | -------------------------------------------------------------------------------- /progress_bar.h: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | * progress_bar.h 3 | * 4 | * Progress Bar utility 5 | * 6 | ******************************************************************************* 7 | * Copyright (C) 2016 Lorenz Hübschle-Schneider 8 | * 9 | * The MIT License (MIT) 10 | * 11 | * Permission is hereby granted, free of charge, to any person obtaining a copy 12 | * of this software and associated documentation files (the "Software"), to deal 13 | * in the Software without restriction, including without limitation the rights 14 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 15 | * copies of the Software, and to permit persons to whom the Software is 16 | * furnished to do so, subject to the following conditions: 17 | * 18 | * The above copyright notice and this permission notice shall be included in 19 | * all copies or substantial portions of the Software. 20 | 21 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 22 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 23 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 24 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 25 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 26 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 27 | * SOFTWARE. 
28 | ******************************************************************************/ 29 | 30 | 31 | #pragma once 32 | 33 | #include <iostream> 34 | 35 | /// A simple progress bar 36 | class progress_bar { 37 | public: 38 | /// Create a new progress bar 39 | /// \param max the value that constitutes 100% 40 | /// \param out the output stream to draw the progress bar on 41 | /// \param barwidth the width of the bar in characters 42 | progress_bar(const unsigned long long max, const std::string &extra, 43 | std::ostream &out = std::cout, int barwidth = 70) 44 | : out(out) 45 | , extra(extra) 46 | , max(max) 47 | , pos(0) 48 | , lastprogress(-1) 49 | , barwidth(barwidth) 50 | , do_draw((out.rdbuf() == std::cout.rdbuf() || out.rdbuf() == std::cerr.rdbuf())) {} 51 | 52 | /// increase progress by 1 step (not percent!) 53 | void step() { 54 | ++pos; 55 | draw(); 56 | } 57 | 58 | /// set progress to a position 59 | /// \param newpos position to set the progress to (steps, not percent!) 60 | void stepto(unsigned long long newpos) { 61 | pos = newpos; 62 | draw(); 63 | } 64 | 65 | void operator++() { 66 | step(); 67 | } 68 | 69 | /// remove all traces of the bar from the output stream 70 | void undraw() { 71 | out << "\r"; 72 | // "[" + "] " + percent (3) + " %" = up to 8 chars 73 | int width = barwidth + 8 + static_cast<int>(extra.length()); 74 | for (int i = 0; i < width; ++i) { 75 | out << " "; 76 | } 77 | out << "\r"; 78 | } 79 | 80 | void set_extra(const std::string &new_extra) { 81 | undraw(); 82 | extra = new_extra; 83 | draw(); 84 | } 85 | 86 | protected: 87 | // adapted from StackOverflow user "leemes": 88 | // http://stackoverflow.com/a/14539953 89 | /// Draw the progress bar to the output stream 90 | void draw() { 91 | if (!do_draw) return; 92 | int progress = static_cast<int>((pos * 100) / max); 93 | if (progress == lastprogress) return; 94 | 95 | out << extra << "["; 96 | int pos = barwidth * progress / 100; 97 | for (int i = 0; i < barwidth; ++i) { 98 | if (i < pos) 99 | out << "="; 
100 | else if (i == pos) 101 | out << ">"; 102 | else 103 | out << " "; 104 | } 105 | out << "] " << progress << " %\r"; 106 | out.flush(); 107 | 108 | lastprogress = progress; 109 | } 110 | 111 | private: 112 | std::ostream &out; 113 | std::string extra; 114 | unsigned long long max; 115 | unsigned long long pos; 116 | int lastprogress; 117 | const int barwidth; 118 | const bool do_draw; 119 | }; 120 | -------------------------------------------------------------------------------- /plots/speed_png.plot: -------------------------------------------------------------------------------- 1 | # IMPORT-DATA stats ../stats.txt 2 | 3 | set terminal pngcairo enhanced font 'Lato,10' 4 | 5 | set style line 11 lc rgb "#333333" lt 1 6 | set border 3 back ls 11 7 | set tics nomirror 8 | # define grid 9 | set style line 12 lc rgb "#333333" lt 0 lw 1 10 | set grid back ls 12 11 | 12 | set grid xtics ytics 13 | 14 | set key top left 15 | 16 | set yrange [0:3.5] 17 | 18 | set xlabel 'Item Count [log_2(n)]' 19 | set ylabel 'Run Time / n log_2n [Nanoseconds]' 20 | 21 | #SQL DELETE FROM stats WHERE LOG(2, size) < 12 22 | 23 | set output "random.png" 24 | set title 'Super Scalar Sample Sort Test: Random' 25 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 26 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 27 | ## MULTIPLOT 28 | ## FROM stats WHERE name = "random" 29 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 30 | plot \ 31 | 'speed_png-data.txt' index 0 title "ssssort" with linespoints, \ 32 | 'speed_png-data.txt' index 1 title "std::sort" with linespoints 33 | 34 | 35 | set output "80pcsorted.png" 36 | set title 'Super Scalar Sample Sort Test: 80% Sorted' 37 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 38 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 39 | ## MULTIPLOT 40 | ## FROM stats WHERE name = "80pcsorted" 41 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 42 | plot \ 43 | 'speed_png-data.txt' index 2 title "ssssort" with linespoints, \ 44 | 'speed_png-data.txt' 
index 3 title "std::sort" with linespoints 45 | 46 | 47 | set output "90pcsorted.png" 48 | set title 'Super Scalar Sample Sort Test: 90% Sorted' 49 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 50 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 51 | ## MULTIPLOT 52 | ## FROM stats WHERE name = "90pcsorted" 53 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 54 | plot \ 55 | 'speed_png-data.txt' index 4 title "ssssort" with linespoints, \ 56 | 'speed_png-data.txt' index 5 title "std::sort" with linespoints 57 | 58 | 59 | set output "99pcsorted.png" 60 | set title 'Super Scalar Sample Sort Test: 99% Sorted' 61 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 62 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 63 | ## MULTIPLOT 64 | ## FROM stats WHERE name = "99pcsorted" 65 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 66 | plot \ 67 | 'speed_png-data.txt' index 6 title "ssssort" with linespoints, \ 68 | 'speed_png-data.txt' index 7 title "std::sort" with linespoints 69 | 70 | 71 | set output "99.9pcsorted.png" 72 | set title 'Super Scalar Sample Sort Test: 99.9% Sorted' 73 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 74 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 75 | ## MULTIPLOT 76 | ## FROM stats WHERE name = "99.9pcsorted" 77 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 78 | plot \ 79 | 'speed_png-data.txt' index 8 title "ssssort" with linespoints, \ 80 | 'speed_png-data.txt' index 9 title "std::sort" with linespoints 81 | 82 | 83 | set output "90pctail.png" 84 | set title 'Super Scalar Sample Sort Test: 90% Sorted + 10% Random Tail' 85 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 86 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 87 | ## MULTIPLOT 88 | ## FROM stats WHERE name = "tail90" 89 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 90 | plot \ 91 | 'speed_png-data.txt' index 10 title "ssssort" with linespoints, \ 92 | 'speed_png-data.txt' index 11 title "std::sort" with linespoints 93 | 94 | 95 | set output "99pctail.png" 96 | 
set title 'Super Scalar Sample Sort Test: 99% Sorted + 1% Random Tail' 97 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 98 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 99 | ## MULTIPLOT 100 | ## FROM stats WHERE name = "tail99" 101 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 102 | plot \ 103 | 'speed_png-data.txt' index 12 title "ssssort" with linespoints, \ 104 | 'speed_png-data.txt' index 13 title "std::sort" with linespoints 105 | 106 | 107 | set output "sorted.png" 108 | set title 'Super Scalar Sample Sort Test: Sorted' 109 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 110 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 111 | ## MULTIPLOT 112 | ## FROM stats WHERE name = "sorted" 113 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 114 | plot \ 115 | 'speed_png-data.txt' index 14 title "ssssort" with linespoints, \ 116 | 'speed_png-data.txt' index 15 title "std::sort" with linespoints 117 | 118 | 119 | 120 | set output "reverse.png" 121 | set title 'Super Scalar Sample Sort Test: Reverse Sorted' 122 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 123 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 124 | ## MULTIPLOT 125 | ## FROM stats WHERE name = "reverse" 126 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 127 | plot \ 128 | 'speed_png-data.txt' index 16 title "ssssort" with linespoints, \ 129 | 'speed_png-data.txt' index 17 title "std::sort" with linespoints 130 | 131 | set output "many-dupes.png" 132 | set title 'Super Scalar Sample Sort Test: Many duplicates (A[i]=i^{16} mod floor(log_2 n)' 133 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 134 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 135 | ## MULTIPLOT 136 | ## FROM stats WHERE name = "many-dupes" 137 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 138 | plot \ 139 | 'speed_png-data.txt' index 18 title "ssssort" with linespoints, \ 140 | 'speed_png-data.txt' index 19 title "std::sort" with linespoints 141 | 142 | 143 | set output "few-spikes-with-noise.png" 144 | set title 
'Super Scalar Sample Sort Test: Few spikes, lots of noise' 145 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 146 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 147 | ## MULTIPLOT 148 | ## FROM stats WHERE name = "few-spikes-with-noise" 149 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 150 | plot \ 151 | 'speed_png-data.txt' index 20 title "ssssort" with linespoints, \ 152 | 'speed_png-data.txt' index 21 title "std::sort" with linespoints 153 | 154 | 155 | set output "ones.png" 156 | set title 'Super Scalar Sample Sort Test: All Ones' 157 | ## MULTIPLOT(algo) SELECT LOG(2, size) AS x, 158 | ## AVG(time) / (size * log(2, size)) * 1e6 AS y, 159 | ## MULTIPLOT 160 | ## FROM stats WHERE name = "ones" 161 | ## GROUP BY MULTIPLOT, x ORDER BY MULTIPLOT, x 162 | plot \ 163 | 'speed_png-data.txt' index 22 title "ssssort" with linespoints, \ 164 | 'speed_png-data.txt' index 23 title "std::sort" with linespoints 165 | -------------------------------------------------------------------------------- /sort.cpp: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | * sort.cpp 3 | * 4 | * Test runner 5 | * 6 | ******************************************************************************* 7 | * Copyright (C) 2016 Lorenz Hübschle-Schneider 8 | * 9 | * The MIT License (MIT) 10 | * 11 | * Permission is hereby granted, free of charge, to any person obtaining a copy 12 | * of this software and associated documentation files (the "Software"), to deal 13 | * in the Software without restriction, including without limitation the rights 14 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 15 | * copies of the Software, and to permit persons to whom the Software is 16 | * furnished to do so, subject to the following conditions: 17 | * 18 | * The above copyright notice and this permission notice shall be included in 19 | * all copies or 
substantial portions of the Software. 20 | 21 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 22 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 23 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 24 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 25 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 26 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 27 | * SOFTWARE. 28 | ******************************************************************************/ 29 | 30 | 31 | #include <cstdint> 32 | #include <cstdlib> 33 | #include <fstream> 34 | #include <iostream> 35 | #include <random> 36 | #include <string> 37 | 38 | #include "benchmark.h" 39 | 40 | // Change this to some other integral type to test other data types 41 | using data_t = int; 42 | 43 | int main(int argc, char *argv[]) { 44 | if (argc > 1 && std::string{argv[1]} == "-h") { 45 | std::cout << "Usage: " << argv[0] 46 | << " [outer iterations] [inner iterations]" 47 | << " [statistics output file]" << std::endl 48 | << "Defaults are 5 outer iterations, 3 inner iterations," 49 | << " and output to stats.txt" << std::endl; 50 | return 0; 51 | } 52 | 53 | std::cout << "This benchmark suite writes output for SqlPlotTools to allow " 54 | << "for easy plotting." << std::endl << "Grab a copy at " 55 | << "https://github.com/bingmann/sqlplot-tools, point it to " 56 | << "speed.plot and run gnuplot on it!" 
<< std::endl; 57 | 58 | // Parse flags 59 | size_t outer_its = 5, inner_its = 3; 60 | if (argc > 1) outer_its = static_cast<size_t>(atol(argv[1])); 61 | if (argc > 2) inner_its = static_cast<size_t>(atol(argv[2])); 62 | 63 | std::string stat_file = "stats.txt"; 64 | if (argc > 3) stat_file = std::string{argv[3]}; 65 | std::ofstream *stat_stream = nullptr; 66 | if (stat_file != "-") { 67 | stat_stream = new std::ofstream; 68 | stat_stream->open(stat_file); 69 | } 70 | 71 | auto random_gen = [](data_t* data, size_t size){ 72 | std::mt19937 rng{ std::random_device{}() }; 73 | for (size_t i = 0; i < size; ++i) { 74 | data[i] = static_cast<data_t>(rng()); 75 | } 76 | }; 77 | 78 | // Warmup 79 | benchmark_generator<data_t>(random_gen, "warmup", 1, 3, stat_stream, 20); 80 | 81 | 82 | // Run Benchmarks 83 | benchmark_generator<data_t>(random_gen, "random", outer_its, inner_its, 84 | stat_stream); 85 | 86 | 87 | // nearly sorted data generator factory 88 | auto nearly_sorted_gen = [](size_t rfrac) { 89 | return [rfrac](data_t* data, size_t size) { 90 | std::mt19937 rng{ std::random_device{}() }; 91 | // fill with sorted data, using entire range of RNG 92 | size_t factor = static_cast<size_t>(static_cast<double>(rng.max()) / size); 93 | for (size_t i = 0; i < size; ++i) { 94 | data[i] = static_cast<data_t>(i * factor); 95 | } 96 | // set 1/rfrac of the items to random values 97 | for (size_t i = 0; i < size/rfrac; ++i) { 98 | data[rng() % size] = static_cast<data_t>(rng()); 99 | } 100 | }; 101 | }; 102 | 103 | benchmark_generator<data_t>(nearly_sorted_gen(5), "80pcsorted", 104 | outer_its, inner_its, stat_stream); 105 | benchmark_generator<data_t>(nearly_sorted_gen(10), "90pcsorted", 106 | outer_its, inner_its, stat_stream); 107 | benchmark_generator<data_t>(nearly_sorted_gen(100), "99pcsorted", 108 | outer_its, inner_its, stat_stream); 109 | benchmark_generator<data_t>(nearly_sorted_gen(1000), "99.9pcsorted", 110 | outer_its, inner_its, stat_stream); 111 | 112 | 113 | // sorted data with a random tail: generator factory 114 | auto unsorted_tail_gen = [](size_t rfrac) { 115 | return 
[rfrac](data_t* data, size_t size) { 116 | std::mt19937 rng{ std::random_device{}() }; 117 | // fill with sorted data, using entire range of RNG 118 | size_t ordered_max = size - (size / rfrac); 119 | size_t factor = static_cast<size_t>(static_cast<double>(rng.max()) / ordered_max); 120 | for (size_t i = 0; i < ordered_max; ++i) { 121 | data[i] = static_cast<data_t>(i * factor); 122 | } 123 | // fill the last 1/rfrac of the items with random values 124 | for (size_t i = ordered_max; i < size; ++i) { 125 | data[i] = static_cast<data_t>(rng()); 126 | } 127 | }; 128 | }; 129 | 130 | benchmark_generator<data_t>(unsorted_tail_gen(10), "tail90", 131 | outer_its, inner_its, stat_stream); 132 | benchmark_generator<data_t>(unsorted_tail_gen(100), "tail99", 133 | outer_its, inner_its, stat_stream); 134 | 135 | 136 | benchmark_generator<data_t>([](data_t* data, size_t size){ 137 | for (size_t i = 0; i < size; ++i) { 138 | data[i] = static_cast<data_t>(i); 139 | } 140 | }, "sorted", outer_its, inner_its, stat_stream, true); 141 | 142 | 143 | benchmark_generator<data_t>([](data_t* data, size_t size){ 144 | for (size_t i = 0; i < size; ++i) { 145 | data[i] = static_cast<data_t>(size - i); 146 | } 147 | }, "reverse", outer_its, inner_its, stat_stream, true); 148 | 149 | 150 | // Benchmark due to Armin Weiß at Universität Stuttgart 151 | benchmark_generator<data_t>([](data_t* data, size_t size) { 152 | size_t flogn = 0, s = size; 153 | while (s >>= 1) ++flogn; // floor(log2(n)) 154 | 155 | for (size_t i = 0; i < size; ++i) { 156 | size_t j = i; 157 | j *= j; j *= j; j *= j; j *= j; 158 | data[i] = static_cast<data_t>(j % flogn); 159 | } 160 | }, "many-dupes", outer_its, inner_its, stat_stream, true); 161 | 162 | 163 | /* Benchmark due to Armin Weiß at Universität Stuttgart 164 | * 165 | * This is an interesting case because the distribution has few very large 166 | * spikes and lots of elements around them. Thus the buckets aren't 167 | * all-equal, and without a break on big buckets, it would recurse a lot. 
168 | */ 169 | benchmark_generator([](data_t* data, size_t size){ 170 | uint64_t prev_pow_2 = 1; 171 | while (2 * prev_pow_2 <= size) { prev_pow_2 *= 2; } 172 | const size_t offset_zw = prev_pow_2 / 2; 173 | 174 | for (size_t i = 0; i < size; i++) { 175 | uint64_t temp = (i*i) % prev_pow_2; 176 | temp = (temp*temp) % prev_pow_2; 177 | data[i] = static_cast( 178 | (offset_zw + temp*temp) % prev_pow_2); 179 | } 180 | }, "few-spikes-with-noise", outer_its, inner_its, stat_stream, true); 181 | 182 | 183 | benchmark_generator([](data_t* data, size_t size){ 184 | for (size_t i = 0; i < size; ++i) { 185 | data[i] = 1; 186 | } 187 | }, "ones", outer_its, inner_its, stat_stream, true); 188 | 189 | 190 | if (stat_stream != nullptr) { 191 | stat_stream->close(); 192 | delete stat_stream; 193 | } 194 | } 195 | -------------------------------------------------------------------------------- /benchmark.h: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | * benchmark.h 3 | * 4 | * Benchmark utilities 5 | * 6 | ******************************************************************************* 7 | * Copyright (C) 2016 Lorenz Hübschle-Schneider 8 | * 9 | * The MIT License (MIT) 10 | * 11 | * Permission is hereby granted, free of charge, to any person obtaining a copy 12 | * of this software and associated documentation files (the "Software"), to deal 13 | * in the Software without restriction, including without limitation the rights 14 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 15 | * copies of the Software, and to permit persons to whom the Software is 16 | * furnished to do so, subject to the following conditions: 17 | * 18 | * The above copyright notice and this permission notice shall be included in 19 | * all copies or substantial portions of the Software. 
20 | 21 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 22 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 23 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 24 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 25 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 26 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 27 | * SOFTWARE. 28 | ******************************************************************************/ 29 | 30 | #pragma once 31 | 32 | const bool debug = false; 33 | 34 | #include 35 | #include 36 | #include 37 | #include 38 | #include 39 | #include 40 | #include 41 | 42 | #include "ssssort.h" 43 | #include "timer.h" 44 | #include "progress_bar.h" 45 | 46 | struct statistics { 47 | // Single-pass standard deviation calculation as described in Donald Knuth: 48 | // The Art of Computer Programming, Volume 2, Chapter 4.2.2, Equations 15&16 49 | double mean; 50 | double nvar; // approx n * variance; stddev = sqrt(nvar / (count-1)) 51 | size_t count; 52 | 53 | statistics() : mean(0.0), nvar(0.0), count(0) {} 54 | 55 | void push(double t) { 56 | ++count; 57 | if (count == 1) { 58 | mean = t; 59 | } else { 60 | double oldmean = mean; 61 | mean += (t - oldmean) / count; 62 | nvar += (t - oldmean) * (t - mean); 63 | } 64 | } 65 | 66 | double avg() { 67 | return mean; 68 | } 69 | double stddev() { 70 | assert(count > 1); 71 | return sqrt(nvar / (count - 1)); 72 | } 73 | }; 74 | 75 | template 76 | void run(T* data, const T* const copy, T* out, size_t size, Sorter sorter, 77 | size_t iterations, statistics& stats, progress_bar &bar, 78 | bool reset_out = true) { 79 | // warmup 80 | sorter(data, out, size); 81 | ++bar; 82 | 83 | Timer timer; 84 | for (size_t it = 0; it < iterations; ++it) { 85 | // reset data and timer 86 | std::copy(copy, copy+size, data); 87 | if (reset_out) 88 | 
memset(out, 0, size * sizeof(T));
89 |         timer.reset();
90 |
91 |         sorter(data, out, size);
92 |
93 |         stats.push(timer.get());
94 |         ++bar;
95 |     }
96 | }
97 |
98 | template <typename T, typename Generator, typename Compare = std::less<>>
99 | size_t benchmark(size_t size, Generator generator, const std::string &name,
100 |                  size_t outer_its, size_t inner_its,
101 |                  std::ofstream *stat_stream, bool deterministic_gen = false,
102 |                  Compare compare = Compare{}) {
103 |     T *data = new T[size],
104 |         *out = new T[size],
105 |         *copy = new T[size];
106 |
107 |     // Number of iterations (-1 selects a size-dependent default)
108 |     if (outer_its == static_cast<size_t>(-1)) {
109 |         if (deterministic_gen) {
110 |             // deterministic is boring
111 |             outer_its = 1;
112 |             if (inner_its == static_cast<size_t>(-1)) {
113 |                 if (size < (1<<14)) inner_its = 1000;
114 |                 else if (size < (1<<16)) inner_its = 500;
115 |                 else if (size < (1<<18)) inner_its = 250;
116 |                 else inner_its = 100;
117 |             }
118 |         } else {
119 |             if (size < (1<<16)) outer_its = 100;
120 |             else if (size < (1<<18)) outer_its = 50;
121 |             else if (size < (1<<22)) outer_its = 35;
122 |             else outer_its = 25;
123 |         }
124 |     }
125 |     if (inner_its == static_cast<size_t>(-1)) {
126 |         inner_its = 10;
127 |     }
128 |
129 |     // the label maker
130 |     auto bar_label = [&](size_t it) {
131 |         return name + " (" + std::to_string(it + 1) + "/" +
132 |             std::to_string(outer_its) + "): ";
133 |     };
134 |
135 |     progress_bar bar(2 * outer_its * (inner_its + 1), bar_label(0));
136 |     Timer timer;
137 |
138 |     double t_generate(0.0), t_verify(0.0);
139 |     bool incorrect = false;
140 |     statistics t_ssssort, t_stdsort;
141 |     for (size_t it = 0; it < outer_its; ++it) {
142 |         bar.set_extra(bar_label(it));
143 |         // Generate the input data
144 |         timer.reset();
145 |         size = generator(data, size);
146 |
147 |         // create a copy to be able to sort it multiple times
148 |         std::copy(data, data+size, copy);
149 |         t_generate += timer.get_and_reset();
150 |
151 |         // Sorting algorithms have their own time tracking
152 |         // 1.
Super Scalar Sample Sort 153 | run(data, copy, out, size, 154 | [compare](T* data, T* out, size_t size) 155 | { ssssort::ssssort(data, data + size, out, compare); }, 156 | inner_its, t_ssssort, bar); 157 | 158 | // 2. std::sort 159 | run(data, copy, out, size, 160 | [compare](T* data, T* /*ignored*/, size_t size) 161 | { std::sort(data, data + size, compare); }, 162 | inner_its, t_stdsort, bar, false); 163 | 164 | 165 | // verify 166 | timer.reset(); 167 | bool it_incorrect = !std::is_sorted(out, out + size, compare); 168 | if (it_incorrect) { 169 | std::cerr << "Output data isn't sorted" << std::endl; 170 | } 171 | for (size_t i = 0; i < size; ++i) { 172 | it_incorrect |= (out[i] != data[i]); 173 | if (debug && out[i] != data[i]) { 174 | std::cerr << "Err at pos " << i << " expected " << data[i] 175 | << " got " << out[i] << std::endl; 176 | } 177 | } 178 | incorrect |= it_incorrect; 179 | t_verify += timer.get_and_reset(); 180 | 181 | } 182 | 183 | bar.undraw(); 184 | 185 | delete[] out; 186 | delete[] data; 187 | delete[] copy; 188 | 189 | std::stringstream output; 190 | output << "RESULT algo=ssssort" 191 | << " name=" << name 192 | << " size=" << size 193 | << " iters=" << outer_its << "*" << inner_its 194 | << " time=" << t_ssssort.avg() 195 | << " stddev=" << t_ssssort.stddev() 196 | << " t_gen=" << t_generate 197 | << " t_check=" << t_verify 198 | << " ok=" << !incorrect 199 | << std::endl 200 | << "RESULT algo=stdsort" 201 | << " name=" << name 202 | << " size=" << size 203 | << " iters=" << outer_its << "*" << inner_its 204 | << " time=" << t_stdsort.avg() 205 | << " stddev=" << t_stdsort.stddev() 206 | << " t_gen=" << t_generate 207 | << " t_check=0" 208 | << " ok=1" 209 | << std::endl; 210 | auto result_str = output.str(); 211 | std::cout << result_str; 212 | if (stat_stream != nullptr) 213 | *stat_stream << result_str << std::flush; 214 | 215 | return size; 216 | } 217 | 218 | template 219 | void benchmark_generator(Generator generator, const 
std::string &name,
220 |                          const size_t outer_its, const size_t inner_its,
221 |                          std::ofstream *stat_stream,
222 |                          bool deterministic_gen = false,
223 |                          const size_t max_log_size = 27) {
224 |     auto wrapped_generator = [generator](T* data, size_t size) {
225 |         generator(data, size);
226 |         return size;
227 |     };
228 |
229 |     // warmup
230 |     benchmark<T>(1<<10, wrapped_generator, "warmup", 1, 10, nullptr);
231 |
232 |     for (size_t log_size = 10; log_size < max_log_size; ++log_size) {
233 |         size_t size = 1 << log_size;
234 |         benchmark<T>(size, wrapped_generator, name, outer_its, inner_its,
235 |                      stat_stream, deterministic_gen);
236 |     }
237 | }
238 |
239 |
240 | template <typename T, typename Generator>
241 | void sized_benchmark_generator(Generator generator, const std::string &name,
242 |                                const size_t outer_its, const size_t inner_its,
243 |                                std::ofstream *stat_stream,
244 |                                bool deterministic_gen = false,
245 |                                const size_t max_log_size = 27) {
246 |     // warmup
247 |     benchmark<T>(1<<10, generator, "warmup", 1, 10, nullptr);
248 |
249 |     for (size_t log_size = 10; log_size < max_log_size; ++log_size) {
250 |         size_t size = 1 << log_size;
251 |         size_t last_size = benchmark<T>(
252 |             size, generator, name, outer_its, inner_its,
253 |             stat_stream, deterministic_gen);
254 |         if (last_size < size) break;
255 |     }
256 | }
257 |
--------------------------------------------------------------------------------
/speed.plot:
--------------------------------------------------------------------------------
1 | # IMPORT-DATA stats stats.txt
2 |
3 | set terminal pdf size 13.33cm,10cm linewidth 2.0
4 | set output "speed.pdf"
5 |
6 | set style line 11 lc rgb "#333333" lt 1
7 | set border 3 back ls 11
8 | set tics nomirror
9 | # define grid
10 | set style line 12 lc rgb "#333333" lt 0 lw 1
11 | set grid back ls 12
12 |
13 | set grid xtics ytics
14 |
15 | set key top left
16 |
17 | set yrange [0:4]
18 |
19 | set xlabel 'Item Count [log_2(n)]'
20 | set ylabel 'Run Time / n log_2n [Nanoseconds]'
21 |
22 | set style line 1 lt 1 lw .2
23 | set
style line 2 lt 1 lw 1 24 | 25 | set style line 3 lt 2 lw .2 26 | set style line 4 lt 2 lw 1 27 | 28 | set style line 5 lt 3 lw .2 29 | set style line 6 lt 3 lw 1 30 | 31 | 32 | #SQL DELETE FROM stats WHERE LOG(2, size) < 12 33 | 34 | # most cross-DB way of doing what "SELECT ..., a FROM ..., (VALUES(-1), (0), 35 | # (1)) as dev (a)" does in postgres 36 | # SQL CREATE TABLE IF NOT EXISTS temp_add (a INTEGER) 37 | # SQL DELETE FROM temp_add 38 | # SQL INSERT INTO temp_add VALUES (-1), (0), (1) 39 | 40 | set title 'Super Scalar Sample Sort Test: Random' 41 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x, 42 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y, 43 | ## MULTIPLOT 44 | ## FROM stats, temp_add WHERE name = 'random' 45 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, size 46 | plot \ 47 | 'speed-data.txt' index 0 notitle ls 3 with lines, \ 48 | 'speed-data.txt' index 1 title "ssssort" ls 4 with linespoints, \ 49 | 'speed-data.txt' index 2 notitle ls 3 with lines, \ 50 | 'speed-data.txt' index 3 notitle ls 5 with lines, \ 51 | 'speed-data.txt' index 4 title "std::sort" ls 6 with linespoints, \ 52 | 'speed-data.txt' index 5 notitle ls 5 with lines 53 | 54 | 55 | set title 'Super Scalar Sample Sort Test: 80% Sorted' 56 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x, 57 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y, 58 | ## MULTIPLOT 59 | ## FROM stats, temp_add WHERE name = '80pcsorted' 60 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, size 61 | plot \ 62 | 'speed-data.txt' index 6 notitle ls 3 with lines, \ 63 | 'speed-data.txt' index 7 title "ssssort" ls 4 with linespoints, \ 64 | 'speed-data.txt' index 8 notitle ls 3 with lines, \ 65 | 'speed-data.txt' index 9 notitle ls 5 with lines, \ 66 | 'speed-data.txt' index 10 title "std::sort" ls 6 with linespoints, \ 67 | 'speed-data.txt' index 11 notitle ls 5 with lines 68 | 69 | 70 | set title 'Super Scalar Sample Sort Test: 90% Sorted' 71 | ## MULTIPLOT(algo, a) SELECT LOG(2, 
size) AS x, 72 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y, 73 | ## MULTIPLOT 74 | ## FROM stats, temp_add WHERE name = '90pcsorted' 75 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, x 76 | plot \ 77 | 'speed-data.txt' index 12 notitle ls 3 with lines, \ 78 | 'speed-data.txt' index 13 title "ssssort" ls 4 with linespoints, \ 79 | 'speed-data.txt' index 14 notitle ls 3 with lines, \ 80 | 'speed-data.txt' index 15 notitle ls 5 with lines, \ 81 | 'speed-data.txt' index 16 title "std::sort" ls 6 with linespoints, \ 82 | 'speed-data.txt' index 17 notitle ls 5 with lines 83 | 84 | 85 | set title 'Super Scalar Sample Sort Test: 99% Sorted' 86 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x, 87 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y, 88 | ## MULTIPLOT 89 | ## FROM stats, temp_add WHERE name = '99pcsorted' 90 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, x 91 | plot \ 92 | 'speed-data.txt' index 18 notitle ls 3 with lines, \ 93 | 'speed-data.txt' index 19 title "ssssort" ls 4 with linespoints, \ 94 | 'speed-data.txt' index 20 notitle ls 3 with lines, \ 95 | 'speed-data.txt' index 21 notitle ls 5 with lines, \ 96 | 'speed-data.txt' index 22 title "std::sort" ls 6 with linespoints, \ 97 | 'speed-data.txt' index 23 notitle ls 5 with lines 98 | 99 | 100 | set title 'Super Scalar Sample Sort Test: 99.9% Sorted' 101 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x, 102 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y, 103 | ## MULTIPLOT 104 | ## FROM stats, temp_add WHERE name = '99.9pcsorted' 105 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, x 106 | plot \ 107 | 'speed-data.txt' index 24 notitle ls 3 with lines, \ 108 | 'speed-data.txt' index 25 title "ssssort" ls 4 with linespoints, \ 109 | 'speed-data.txt' index 26 notitle ls 3 with lines, \ 110 | 'speed-data.txt' index 27 notitle ls 5 with lines, \ 111 | 'speed-data.txt' index 28 title "std::sort" ls 6 with linespoints, \ 112 | 'speed-data.txt' index 29 
notitle ls 5 with lines 113 | 114 | 115 | set title 'Super Scalar Sample Sort Test: 90% Sorted + 10% Random Tail' 116 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x, 117 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y, 118 | ## MULTIPLOT 119 | ## FROM stats, temp_add WHERE name = 'tail90' 120 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, x 121 | plot \ 122 | 'speed-data.txt' index 30 notitle ls 3 with lines, \ 123 | 'speed-data.txt' index 31 title "ssssort" ls 4 with linespoints, \ 124 | 'speed-data.txt' index 32 notitle ls 3 with lines, \ 125 | 'speed-data.txt' index 33 notitle ls 5 with lines, \ 126 | 'speed-data.txt' index 34 title "std::sort" ls 6 with linespoints, \ 127 | 'speed-data.txt' index 35 notitle ls 5 with lines 128 | 129 | 130 | set title 'Super Scalar Sample Sort Test: 99% Sorted + 1% Random Tail' 131 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x, 132 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y, 133 | ## MULTIPLOT 134 | ## FROM stats, temp_add WHERE name = 'tail99' 135 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, x 136 | plot \ 137 | 'speed-data.txt' index 36 notitle ls 3 with lines, \ 138 | 'speed-data.txt' index 37 title "ssssort" ls 4 with linespoints, \ 139 | 'speed-data.txt' index 38 notitle ls 3 with lines, \ 140 | 'speed-data.txt' index 39 notitle ls 5 with lines, \ 141 | 'speed-data.txt' index 40 title "std::sort" ls 6 with linespoints, \ 142 | 'speed-data.txt' index 41 notitle ls 5 with lines 143 | 144 | 145 | set title 'Super Scalar Sample Sort Test: Sorted' 146 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x, 147 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y, 148 | ## MULTIPLOT 149 | ## FROM stats, temp_add WHERE name = 'sorted' 150 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, x 151 | plot \ 152 | 'speed-data.txt' index 42 notitle ls 3 with lines, \ 153 | 'speed-data.txt' index 43 title "ssssort" ls 4 with linespoints, \ 154 | 'speed-data.txt' index 44 notitle ls 3 
with lines, \
155 |     'speed-data.txt' index 45 notitle ls 5 with lines, \
156 |     'speed-data.txt' index 46 title "std::sort" ls 6 with linespoints, \
157 |     'speed-data.txt' index 47 notitle ls 5 with lines
158 |
159 |
160 |
161 | set title 'Super Scalar Sample Sort Test: Reverse Sorted'
162 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x,
163 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y,
164 | ## MULTIPLOT
165 | ## FROM stats, temp_add WHERE name = 'reverse'
166 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, x
167 | plot \
168 |     'speed-data.txt' index 48 notitle ls 3 with lines, \
169 |     'speed-data.txt' index 49 title "ssssort" ls 4 with linespoints, \
170 |     'speed-data.txt' index 50 notitle ls 3 with lines, \
171 |     'speed-data.txt' index 51 notitle ls 5 with lines, \
172 |     'speed-data.txt' index 52 title "std::sort" ls 6 with linespoints, \
173 |     'speed-data.txt' index 53 notitle ls 5 with lines
174 |
175 |
176 | set title 'Super Scalar Sample Sort Test: Many duplicates (A[i]=i^{16} mod floor(log_2 n))'
177 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x,
178 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y,
179 | ## MULTIPLOT
180 | ## FROM stats, temp_add WHERE name = 'many-dupes'
181 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, x
182 | plot \
183 |     'speed-data.txt' index 54 notitle ls 3 with lines, \
184 |     'speed-data.txt' index 55 title "ssssort" ls 4 with linespoints, \
185 |     'speed-data.txt' index 56 notitle ls 3 with lines, \
186 |     'speed-data.txt' index 57 notitle ls 5 with lines, \
187 |     'speed-data.txt' index 58 title "std::sort" ls 6 with linespoints, \
188 |     'speed-data.txt' index 59 notitle ls 5 with lines
189 |
190 |
191 | set title 'Super Scalar Sample Sort Test: Few spikes, lots of noise'
192 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x,
193 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y,
194 | ## MULTIPLOT
195 | ## FROM stats, temp_add WHERE name = 'few-spikes-with-noise'
196 | ## GROUP BY
MULTIPLOT, size ORDER BY MULTIPLOT, x 197 | plot \ 198 | 'speed-data.txt' index 60 notitle ls 3 with lines, \ 199 | 'speed-data.txt' index 61 title "ssssort" ls 4 with linespoints, \ 200 | 'speed-data.txt' index 62 notitle ls 3 with lines, \ 201 | 'speed-data.txt' index 63 notitle ls 5 with lines, \ 202 | 'speed-data.txt' index 64 title "std::sort" ls 6 with linespoints, \ 203 | 'speed-data.txt' index 65 notitle ls 5 with lines 204 | 205 | 206 | set title 'Super Scalar Sample Sort Test: All Ones' 207 | ## MULTIPLOT(algo, a) SELECT LOG(2, size) AS x, 208 | ## AVG(time + a * stddev) / (size * log(2, size)) * 1e6 AS y, 209 | ## MULTIPLOT 210 | ## FROM stats, temp_add WHERE name = 'ones' 211 | ## GROUP BY MULTIPLOT, size ORDER BY MULTIPLOT, x 212 | plot \ 213 | 'speed-data.txt' index 66 notitle ls 3 with lines, \ 214 | 'speed-data.txt' index 67 title "ssssort" ls 4 with linespoints, \ 215 | 'speed-data.txt' index 68 notitle ls 3 with lines, \ 216 | 'speed-data.txt' index 69 notitle ls 5 with lines, \ 217 | 'speed-data.txt' index 70 title "std::sort" ls 6 with linespoints, \ 218 | 'speed-data.txt' index 71 notitle ls 5 with lines 219 | 220 | 221 | # SQL DROP TABLE temp_add 222 | -------------------------------------------------------------------------------- /ssssort.h: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | * ssssort.h 3 | * 4 | * Super Scalar Sample Sort 5 | * 6 | ******************************************************************************* 7 | * Copyright (C) 2014 Timo Bingmann 8 | * Copyright (C) 2016 Lorenz Hübschle-Schneider 9 | * Copyright (C) 2016 Morwenn 10 | * 11 | * The MIT License (MIT) 12 | * 13 | * Permission is hereby granted, free of charge, to any person obtaining a copy 14 | * of this software and associated documentation files (the "Software"), to deal 15 | * in the Software without restriction, including without 
limitation the rights 16 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 17 | * copies of the Software, and to permit persons to whom the Software is 18 | * furnished to do so, subject to the following conditions: 19 | * 20 | * The above copyright notice and this permission notice shall be included in 21 | * all copies or substantial portions of the Software. 22 | 23 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 24 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 25 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 26 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 27 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 28 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 29 | * SOFTWARE. 30 | ******************************************************************************/ 31 | 32 | #pragma once 33 | 34 | #include 35 | #include 36 | #include 37 | #include 38 | #include 39 | #include 40 | #include 41 | #include 42 | #include 43 | #include 44 | 45 | // Compiler hints about invariants, inspired by ICC's __assume() 46 | #define __assume(cond) ({ if (!(cond)) __builtin_unreachable(); }) 47 | 48 | // C++11 compatibility 49 | #if __cplusplus < 201402L 50 | // 51 | namespace std { 52 | // std::make_unique for arrays of unknown bound 53 | template 54 | inline unique_ptr make_unique(size_t size) { 55 | using scalar_t = typename remove_extent::type; 56 | return unique_ptr(new scalar_t[size]()); 57 | } 58 | 59 | template <> 60 | struct less { 61 | template 62 | bool operator()(const T& x, const T& y) const { 63 | return x < y; 64 | } 65 | }; 66 | 67 | template 68 | using result_of_t = typename result_of::type; 69 | 70 | template 71 | using enable_if_t = typename enable_if::type; 72 | } 73 | #endif 74 | 75 | 76 | namespace ssssort { 77 | 78 | /** 79 | * Bucket or input size 
below which to fall back to the base case sorter
80 |  * (std::sort)
81 |  */
82 | constexpr std::size_t basecase_size = 1024;
83 |
84 | /**
85 |  * logBuckets determines how many splitters are used. Sample Sort partitions
86 |  * the data into buckets, whose number is typically a power of two. Thus, we
87 |  * specify its base-2 logarithm. For the partitioning into k buckets, we then
88 |  * need k-1 splitters. logBuckets is a tuning parameter, typically 7 or 8.
89 |  */
90 | constexpr std::size_t logBuckets = 8;
91 | constexpr std::size_t numBuckets = 1 << logBuckets;
92 |
93 | /**
94 |  * Type to be used for bucket indices. In this case, a uint32_t is overkill,
95 |  * but turned out to be fastest. 16-bit arithmetic is peculiarly slow on recent
96 |  * Intel CPUs. Needs to fit 2*numBuckets-1 (for the step() function), so
97 |  * uint8_t would work for logBuckets = 7.
98 |  */
99 | using bucket_t = std::uint32_t;
100 |
101 |
102 | // Random number generation engine for sampling. Declared out-of-class for
103 | // simplicity. You can swap this out for std::minstd_rand if the Mersenne
104 | // Twister is too slow on your hardware. It's only minimally slower on mine
105 | // (Haswell i7-4790T).
106 | #ifdef __MINGW32__
107 | thread_local std::mt19937 gen(std::time(nullptr));
108 | #else
109 | thread_local std::mt19937 gen{std::random_device{}()};
110 | #endif
111 |
112 | // Provides different sampling strategies to choose splitters
113 | template <typename Iterator>
114 | struct Sampler {
115 |     using value_type = typename std::iterator_traits<Iterator>::value_type;
116 |
117 |     // Draw a random sample without replacement using the Fisher-Yates Shuffle.
118 |     // This reorders the input somewhat but the sorting does that anyway.
119 | static void draw_sample_fisheryates(Iterator begin, Iterator end, 120 | value_type* samples, std::size_t sample_size) 121 | { 122 | // Random generator 123 | assert(begin <= end); 124 | std::size_t max = static_cast(end - begin); 125 | assert(gen.max() >= max); 126 | 127 | for (std::size_t i = 0; i < sample_size; ++i) { 128 | std::size_t index = gen() % max--; // biased, don't care 129 | std::swap(*(begin + index), *(begin + max)); 130 | samples[i] = *(begin + max); 131 | } 132 | } 133 | 134 | 135 | // Draw a random sample with replacement by generating random indices. On my 136 | // machine this results in measurably slower sorting than a 137 | // Fisher-Yates-based sample, so beware the apparent simplicity. 138 | static void draw_sample_simplerand(Iterator begin, Iterator end, 139 | value_type* samples, std::size_t sample_size) 140 | { 141 | // Random generator 142 | assert(begin <= end); 143 | const std::size_t size = static_cast(end - begin); 144 | assert(gen.max() >= size); 145 | 146 | for (std::size_t i = 0; i < sample_size; ++i) { 147 | std::size_t index = gen() % size; // biased, don't care 148 | samples[i] = *(begin + index); 149 | } 150 | } 151 | 152 | 153 | // A completely non-random sample that's beyond terrible on sorted inputs 154 | static void draw_sample_first(Iterator begin, Iterator /* end */, 155 | value_type *samples, std::size_t sample_size) { 156 | for (std::size_t i = 0; i < sample_size; ++i) { 157 | samples[i] = *(begin + i); 158 | } 159 | } 160 | 161 | static void draw_sample(Iterator begin, Iterator end, 162 | value_type *samples, std::size_t sample_size) 163 | { 164 | draw_sample_fisheryates(begin, end, samples, sample_size); 165 | } 166 | 167 | }; 168 | 169 | /** 170 | * Classify elements into buckets. Template parameter treebits specifies the 171 | * log2 of the number of buckets (= 1 << treebits). 
172 | */ 173 | template 175 | struct Classifier { 176 | using value_type = typename std::iterator_traits::value_type; 177 | 178 | const std::size_t num_splitters = (1 << treebits) - 1; 179 | const std::size_t splitters_size = 1 << treebits; 180 | value_type splitters[1 << treebits]; 181 | 182 | /// maps items to buckets 183 | bucket_t* const bktout; 184 | /// counts bucket sizes 185 | std::unique_ptr bktsize; 186 | 187 | /** 188 | * Constructs the splitter tree from the given samples 189 | */ 190 | Classifier(const value_type *samples, const std::size_t sample_size, 191 | bucket_t* const bktout) 192 | : bktout(bktout) 193 | , bktsize(std::make_unique(1 << treebits)) 194 | { 195 | std::fill(bktsize.get(), bktsize.get() + (1 << treebits), 0); 196 | build_recursive(samples, samples + sample_size, 1); 197 | } 198 | 199 | /// recursively builds splitter tree. Used by constructor. 200 | void build_recursive(const value_type* lo, const value_type* hi, std::size_t pos) { 201 | __assume(hi >= lo); 202 | const value_type *mid = lo + (hi - lo)/2; 203 | splitters[pos] = *mid; 204 | 205 | if (2 * pos < num_splitters) { 206 | build_recursive(lo, mid, 2*pos); 207 | build_recursive(mid + 1, hi , 2*pos + 1); 208 | } 209 | } 210 | 211 | /* 212 | * What follows is an ugly SFINAE switch. Theoretically, the first case 213 | * suffices for all types - there's no reason why passing const int& should 214 | * be slower than int. However, g++ emits weird code when `key` is passed 215 | * by reference, so force it to dereference `key` for the call. It generates 216 | * the same code as when using operator>(key, splitters[i]) instead of 217 | * compare(splitters[i], key), which it doesn't if `key` is a reference. 218 | */ 219 | /// Push an element down the tree one step. Inlined. 
220 | template ::value>* = nullptr> 222 | constexpr bucket_t step(bucket_t i, const T &key, Compare compare) const { 223 | __assume(i > 0); 224 | return 2*i + compare(splitters[i], key); 225 | } 226 | 227 | template ::value>* = nullptr> 229 | constexpr bucket_t step(bucket_t i, const T key, Compare compare) const { 230 | __assume(i > 0); 231 | return 2*i + compare(splitters[i], key); 232 | } 233 | 234 | /// Find the bucket for a single element 235 | constexpr bucket_t find_bucket(const value_type &key, Compare compare) const { 236 | bucket_t i = 1; 237 | while (i <= num_splitters) i = step(i, key, compare); 238 | return (i - static_cast(splitters_size)); 239 | } 240 | 241 | /** 242 | * Find the bucket for U elements at the same time. This version will be 243 | * unrolled by the compiler. Degree of unrolling is a template parameter, 4 244 | * is a good choice usually. 245 | */ 246 | template 247 | inline void find_bucket_unroll(InputIterator key, bucket_t* __restrict__ obkt, Compare compare) 248 | { 249 | bucket_t i[U]; 250 | for (int u = 0; u < U; ++u) i[u] = 1; 251 | 252 | for (std::size_t l = 0; l < treebits; ++l) { 253 | // step on all U keys 254 | for (int u = 0; u < U; ++u) i[u] = step(i[u], *(key + u), compare); 255 | } 256 | for (int u = 0; u < U; ++u) { 257 | i[u] -= splitters_size; 258 | obkt[u] = i[u]; 259 | bktsize[i[u]]++; 260 | } 261 | } 262 | 263 | /// classify all elements by pushing them down the tree and saving bucket id 264 | inline void classify(InputIterator begin, InputIterator end, Compare compare, 265 | bucket_t* __restrict__ bktout = nullptr) { 266 | if (bktout == nullptr) bktout = this->bktout; 267 | for (InputIterator it = begin; it != end;) { 268 | bucket_t bucket = find_bucket(*it++, compare); 269 | *bktout++ = bucket; 270 | bktsize[bucket]++; 271 | } 272 | } 273 | 274 | /// Classify all elements with unrolled bucket finding implementation 275 | template 276 | void classify_unroll(InputIterator begin, InputIterator end, Compare compare) 
{
277 |         bucket_t* bktout = this->bktout;
278 |         InputIterator it = begin;
279 |         for (; it + U < end; it += U, bktout += U) {
280 |             find_bucket_unroll<U>(it, bktout, compare);
281 |         }
282 |         // process remainder
283 |         __assume(end-it <= U);
284 |         classify(it, end, compare, bktout);
285 |     }
286 |
287 |     /**
288 |      * Distribute the elements in [in_begin, in_end) into consecutive buckets,
289 |      * storage for which begins at out_begin. Need to call classify or
290 |      * classify_unroll first to fill the bktout and bktsize arrays.
291 |      */
292 |     template <int U, typename OutputIterator>
293 |     void distribute(InputIterator in_begin, InputIterator in_end,
294 |                     OutputIterator out_begin)
295 |     {
296 |         assert(in_begin <= in_end);
297 |         // exclusive prefix sum
298 |         for (std::size_t i = 0, sum = 0; i < numBuckets; ++i) {
299 |             bktsize_t curr_size = bktsize[i];
300 |             bktsize[i] = sum;
301 |             sum += curr_size;
302 |         }
303 |         const std::size_t n = static_cast<std::size_t>(in_end - in_begin);
304 |         std::size_t i;
305 |         for (i = 0; i + U < n; i += U) {
306 |             for (std::size_t u = 0; u < U; ++u) {
307 |                 *(out_begin + bktsize[bktout[i+u]]++) = std::move(*(in_begin + i + u));
308 |             }
309 |         }
310 |         // process the rest
311 |         __assume(n-i <= U);
312 |         for (; i < n; ++i) {
313 |             *(out_begin + bktsize[bktout[i]]++) = std::move(*(in_begin + i));
314 |         }
315 |     }
316 |
317 | };
318 |
319 |
320 | // Factor to multiply number of buckets by to obtain the number of samples drawn
321 | inline std::size_t oversampling_factor(std::size_t n) {
322 |     double r = std::sqrt(double(n)/(2*numBuckets*(logBuckets+4)));
323 |     return std::max<std::size_t>(r, 1);
324 | }
325 |
326 |
327 | /**
328 |  * Stupid wrapper to prevent libstdc++ from wrapping the compare object (because
329 |  * it internally uses a compare function on iterators)
330 |  */
331 | template <typename Iterator, typename Compare>
332 | void stl_sort(Iterator begin, Iterator end, Compare compare) {
333 |     std::sort(begin, end, compare);
334 | }
335 |
336 | template <typename Iterator>
337 | void stl_sort(Iterator begin, Iterator end, std::less<>) {
338 |     std::sort(begin,
end); 339 | } 340 | 341 | 342 | /** 343 | * Internal sorter (argument list isn't all that pretty). 344 | * 345 | * begin_is_home indicates whether the output should be stored in the range 346 | * given by begin and end (=true) or out_begin and out_begin + (end - begin) 347 | * (=false). 348 | * 349 | * It is assumed that the range out_begin to out_begin + (end - begin) is valid. 350 | */ 351 | template 352 | void ssssort_int(InputIterator begin, InputIterator end, 353 | OutputIterator out_begin, Compare compare, 354 | bucket_t* __restrict__ bktout, bool begin_is_home) { 355 | using value_type = typename std::iterator_traits::value_type; 356 | 357 | assert(begin <= end); 358 | const std::size_t n = static_cast(end - begin); 359 | 360 | // draw and sort sample 361 | const std::size_t sample_size = oversampling_factor(n) * numBuckets; 362 | auto samples = std::make_unique(sample_size); 363 | Sampler::draw_sample(begin, end, samples.get(), sample_size); 364 | stl_sort(samples.get(), samples.get() + sample_size, compare); 365 | 366 | if (samples[0] == samples[sample_size - 1]) { 367 | // All samples are equal. Clean up and fall back to std::sort 368 | samples.reset(nullptr); 369 | stl_sort(begin, end, compare); 370 | if (!begin_is_home) { 371 | std::move(begin, end, out_begin); 372 | } 373 | return; 374 | } 375 | 376 | // classify elements 377 | Classifier 378 | classifier(samples.get(), sample_size, bktout); 379 | samples.reset(nullptr); 380 | classifier.template classify_unroll<6>(begin, end, compare); 381 | classifier.template distribute<4>(begin, end, out_begin); 382 | 383 | // Recursive calls. offset is the offset into the arrays (/iterators) for 384 | // the current bucket. 
385 |     std::size_t offset = 0;
386 |     for (std::size_t i = 0; i < numBuckets; ++i) {
387 |         auto size = classifier.bktsize[i] - offset;
388 |         if (size == 0) continue; // empty bucket
389 |         if (size <= basecase_size || (n / size) < 2) {
390 |             // Either it's a small bucket, or very large (more than half of all
391 |             // elements). In either case, we fall back to std::sort. The reason
392 |             // we're falling back to std::sort in the second case is that the
393 |             // partitioning into buckets is obviously not working (likely
394 |             // because a single value made up the majority of the items in the
395 |             // previous recursion level, but it's also surrounded by lots of
396 |             // other infrequent elements, passing the "all-samples-equal" test).
397 |             stl_sort(out_begin + offset, out_begin + classifier.bktsize[i], compare);
398 |             if (begin_is_home) {
399 |                 // odd recursion level: the result belongs in the begin range,
                  // so we have to move it there
400 |                 std::move(out_begin + offset,
401 |                           out_begin + classifier.bktsize[i],
402 |                           begin + offset);
403 |             }
404 |         } else {
405 |             // large bucket, apply sample sort recursively
406 |             ssssort_int(
407 |                 out_begin + offset,
408 |                 out_begin + classifier.bktsize[i], // = out_begin + offset + size
409 |                 begin + offset,
410 |                 compare,
411 |                 bktout + offset,
412 |                 !begin_is_home);
413 |         }
414 |         offset += size;
415 |     }
416 | }
417 |
418 | /**
419 |  * Sort [begin, end), output is stored in [out_begin, out_begin + (end-begin))
420 |  *
421 |  * The elements in [begin, end) will be permuted after calling this.
422 |  * Uses <= 2*(end-begin)*sizeof(value_type) bytes of additional memory.
423 |  */
424 | template <typename InputIterator, typename OutputIterator,
425 |           typename Compare = std::less<>>
426 | void ssssort(InputIterator begin, InputIterator end, OutputIterator out_begin, Compare compare = {}) {
427 |     using value_type = typename std::iterator_traits<InputIterator>::value_type;
428 |     static_assert(std::is_convertible>::value,
429 |                   "the result of the predicate shall be convertible to bool");
430 |
431 |     assert(begin <= end);
432 |     const std::size_t n = static_cast<std::size_t>(end - begin);
433 |     if (n < basecase_size) {
434 |         // base case
435 |         stl_sort(begin, end, compare);
436 |         std::move(begin, end, out_begin);
437 |         return;
438 |     }
439 |
440 |     auto bktout = std::make_unique<bucket_t[]>(n);
441 |     ssssort_int(begin, end, out_begin, compare, bktout.get(), false);
442 | }
443 |
444 | /**
445 |  * Sort the range [begin, end).
446 |  *
447 |  * Uses <= 3*(end-begin)*sizeof(value_type) bytes of additional memory
448 |  */
449 | template <typename Iterator, typename Compare = std::less<>>
450 | void ssssort(Iterator begin, Iterator end, Compare compare = {}) {
451 |     using value_type = typename std::iterator_traits<Iterator>::value_type;
452 |     static_assert(std::is_convertible>::value,
453 |                   "the result of the predicate shall be convertible to bool");
454 |
455 |     assert(begin <= end);
456 |     const std::size_t n = static_cast<std::size_t>(end - begin);
457 |
458 |     if (n < basecase_size) {
459 |         // base case
460 |         stl_sort(begin, end, compare);
461 |         return;
462 |     }
463 |
464 |     auto out = std::make_unique<value_type[]>(n);
465 |     auto bktout = std::make_unique<bucket_t[]>(n);
466 |     ssssort_int(begin, end, out.get(), compare, bktout.get(), true);
467 | }
468 |
469 | }
470 |
--------------------------------------------------------------------------------
/stats.txt:
--------------------------------------------------------------------------------
1 | RESULT algo=ssssort name=warmup size=1024 iters=1*3 time=0.0226667 stddev=0.00057735 t_gen=0.005 t_check=0 ok=1
2 | RESULT algo=stdsort name=warmup size=1024 iters=1*3 time=0.0283333 stddev=0.00251661 t_gen=0.005 t_check=0 ok=1
3 | RESULT algo=ssssort name=warmup size=2048 iters=1*3 time=0.0406667 stddev=0.00057735 t_gen=0.01
t_check=0.004 ok=1 4 | RESULT algo=stdsort name=warmup size=2048 iters=1*3 time=0.134 stddev=0.004 t_gen=0.01 t_check=0 ok=1 5 | RESULT algo=ssssort name=warmup size=4096 iters=1*3 time=0.17 stddev=0 t_gen=0.038 t_check=0.004 ok=1 6 | RESULT algo=stdsort name=warmup size=4096 iters=1*3 time=0.154 stddev=0.0207846 t_gen=0.038 t_check=0 ok=1 7 | RESULT algo=ssssort name=warmup size=8192 iters=1*3 time=0.185333 stddev=0.00057735 t_gen=0.035 t_check=0.008 ok=1 8 | RESULT algo=stdsort name=warmup size=8192 iters=1*3 time=0.308333 stddev=0.0011547 t_gen=0.035 t_check=0 ok=1 9 | RESULT algo=ssssort name=warmup size=16384 iters=1*3 time=0.41 stddev=0.001 t_gen=0.069 t_check=0.016 ok=1 10 | RESULT algo=stdsort name=warmup size=16384 iters=1*3 time=0.67 stddev=0.0177764 t_gen=0.069 t_check=0 ok=1 11 | RESULT algo=ssssort name=warmup size=32768 iters=1*3 time=0.921667 stddev=0.0366924 t_gen=0.133 t_check=0.034 ok=1 12 | RESULT algo=stdsort name=warmup size=32768 iters=1*3 time=1.433 stddev=0.0149332 t_gen=0.133 t_check=0 ok=1 13 | RESULT algo=ssssort name=warmup size=65536 iters=1*3 time=1.987 stddev=0.0121244 t_gen=0.307 t_check=0.069 ok=1 14 | RESULT algo=stdsort name=warmup size=65536 iters=1*3 time=3.04633 stddev=0.0189033 t_gen=0.307 t_check=0 ok=1 15 | RESULT algo=ssssort name=warmup size=131072 iters=1*3 time=4.21667 stddev=0.0833447 t_gen=0.616 t_check=0.105 ok=1 16 | RESULT algo=stdsort name=warmup size=131072 iters=1*3 time=6.414 stddev=0.031 t_gen=0.616 t_check=0 ok=1 17 | RESULT algo=ssssort name=warmup size=262144 iters=1*3 time=7.814 stddev=0.0650461 t_gen=1.243 t_check=0.211 ok=1 18 | RESULT algo=stdsort name=warmup size=262144 iters=1*3 time=13.1773 stddev=0.0212211 t_gen=1.243 t_check=0 ok=1 19 | RESULT algo=ssssort name=warmup size=524288 iters=1*3 time=13.8923 stddev=0.109974 t_gen=2.442 t_check=0.443 ok=1 20 | RESULT algo=stdsort name=warmup size=524288 iters=1*3 time=28.3967 stddev=0.0150111 t_gen=2.442 t_check=0 ok=1 21 | RESULT algo=ssssort name=warmup 
size=1048576 iters=1*3 time=29.4363 stddev=0.200645 t_gen=5.125 t_check=1.015 ok=1 22 | RESULT algo=stdsort name=warmup size=1048576 iters=1*3 time=59.5397 stddev=0.04389 t_gen=5.125 t_check=0 ok=1 23 | RESULT algo=ssssort name=warmup size=2097152 iters=1*3 time=63.3517 stddev=0.400455 t_gen=10.466 t_check=2.122 ok=1 24 | RESULT algo=stdsort name=warmup size=2097152 iters=1*3 time=125.397 stddev=0.0390043 t_gen=10.466 t_check=0 ok=1 25 | RESULT algo=ssssort name=warmup size=4194304 iters=1*3 time=135.99 stddev=0.81358 t_gen=21.296 t_check=4.352 ok=1 26 | RESULT algo=stdsort name=warmup size=4194304 iters=1*3 time=259.532 stddev=0.0920887 t_gen=21.296 t_check=0 ok=1 27 | RESULT algo=ssssort name=warmup size=8388608 iters=1*3 time=295.645 stddev=0.245508 t_gen=42.686 t_check=8.712 ok=1 28 | RESULT algo=stdsort name=warmup size=8388608 iters=1*3 time=543.724 stddev=0.123779 t_gen=42.686 t_check=0 ok=1 29 | RESULT algo=ssssort name=warmup size=16777216 iters=1*3 time=632.369 stddev=0.392128 t_gen=85.644 t_check=17.106 ok=1 30 | RESULT algo=stdsort name=warmup size=16777216 iters=1*3 time=1130.42 stddev=0.12885 t_gen=85.644 t_check=0 ok=1 31 | RESULT algo=ssssort name=warmup size=33554432 iters=1*3 time=1319.52 stddev=5.3736 t_gen=173.232 t_check=33.459 ok=1 32 | RESULT algo=stdsort name=warmup size=33554432 iters=1*3 time=2376.34 stddev=10.1718 t_gen=173.232 t_check=0 ok=1 33 | RESULT algo=ssssort name=warmup size=67108864 iters=1*3 time=2452.22 stddev=1.78877 t_gen=343.025 t_check=65.751 ok=1 34 | RESULT algo=stdsort name=warmup size=67108864 iters=1*3 time=4979.01 stddev=77.0363 t_gen=343.025 t_check=0 ok=1 35 | RESULT algo=ssssort name=random size=1024 iters=100*10 time=0.02165 stddev=0.000732145 t_gen=0.501 t_check=0 ok=1 36 | RESULT algo=stdsort name=random size=1024 iters=100*10 time=0.023515 stddev=0.00308339 t_gen=0.501 t_check=0 ok=1 37 | RESULT algo=ssssort name=random size=2048 iters=100*10 time=0.039035 stddev=0.0010202 t_gen=0.801 t_check=0.1 ok=1 38 | 
RESULT algo=stdsort name=random size=2048 iters=100*10 time=0.060301 stddev=0.00318756 t_gen=0.801 t_check=0 ok=1 39 | RESULT algo=ssssort name=random size=4096 iters=100*10 time=0.092174 stddev=0.0103017 t_gen=1.822 t_check=0.349 ok=1 40 | RESULT algo=stdsort name=random size=4096 iters=100*10 time=0.150291 stddev=0.0116816 t_gen=1.822 t_check=0 ok=1 41 | RESULT algo=ssssort name=random size=8192 iters=100*10 time=0.225211 stddev=0.00342874 t_gen=4.406 t_check=0.878 ok=1 42 | RESULT algo=stdsort name=random size=8192 iters=100*10 time=0.35711 stddev=0.00378738 t_gen=4.406 t_check=0 ok=1 43 | RESULT algo=ssssort name=random size=16384 iters=100*10 time=0.414465 stddev=0.0220914 t_gen=5.962 t_check=1.333 ok=1 44 | RESULT algo=stdsort name=random size=16384 iters=100*10 time=0.670485 stddev=0.025558 t_gen=5.962 t_check=0 ok=1 45 | RESULT algo=ssssort name=random size=32768 iters=100*10 time=0.936515 stddev=0.0779015 t_gen=12.653 t_check=2.868 ok=1 46 | RESULT algo=stdsort name=random size=32768 iters=100*10 time=1.46997 stddev=0.0934528 t_gen=12.653 t_check=0 ok=1 47 | RESULT algo=ssssort name=random size=65536 iters=50*10 time=1.99009 stddev=0.0934163 t_gen=11.755 t_check=2.696 ok=1 48 | RESULT algo=stdsort name=random size=65536 iters=50*10 time=3.02734 stddev=0.113548 t_gen=11.755 t_check=0 ok=1 49 | RESULT algo=ssssort name=random size=131072 iters=50*10 time=4.37991 stddev=0.360608 t_gen=25.289 t_check=5.772 ok=1 50 | RESULT algo=stdsort name=random size=131072 iters=50*10 time=6.58336 stddev=0.426121 t_gen=25.289 t_check=0 ok=1 51 | RESULT algo=ssssort name=random size=262144 iters=35*10 time=8.16563 stddev=0.746955 t_gen=35.017 t_check=7.94 ok=1 52 | RESULT algo=stdsort name=random size=262144 iters=35*10 time=13.8095 stddev=0.830522 t_gen=35.017 t_check=0 ok=1 53 | RESULT algo=ssssort name=random size=524288 iters=35*10 time=13.8847 stddev=0.0511877 t_gen=64.189 t_check=15.593 ok=1 54 | RESULT algo=stdsort name=random size=524288 iters=35*10 time=28.2107 
stddev=0.189309 t_gen=64.189 t_check=0 ok=1 55 | RESULT algo=ssssort name=random size=1048576 iters=35*10 time=29.3272 stddev=0.158011 t_gen=127.715 t_check=32.994 ok=1 56 | RESULT algo=stdsort name=random size=1048576 iters=35*10 time=59.3689 stddev=0.343949 t_gen=127.715 t_check=0 ok=1 57 | RESULT algo=ssssort name=random size=2097152 iters=35*10 time=63.0449 stddev=0.155006 t_gen=263.963 t_check=74.007 ok=1 58 | RESULT algo=stdsort name=random size=2097152 iters=35*10 time=124.045 stddev=0.663766 t_gen=263.963 t_check=0 ok=1 59 | RESULT algo=ssssort name=random size=4194304 iters=25*10 time=134.401 stddev=0.225284 t_gen=392.696 t_check=103.792 ok=1 60 | RESULT algo=stdsort name=random size=4194304 iters=25*10 time=260.853 stddev=1.35765 t_gen=392.696 t_check=0 ok=1 61 | RESULT algo=ssssort name=random size=8388608 iters=25*10 time=293.984 stddev=0.119949 t_gen=778.284 t_check=203.265 ok=1 62 | RESULT algo=stdsort name=random size=8388608 iters=25*10 time=544.884 stddev=7.01739 t_gen=778.284 t_check=0 ok=1 63 | RESULT algo=ssssort name=random size=16777216 iters=25*10 time=632.297 stddev=13.5124 t_gen=1573.56 t_check=383.103 ok=1 64 | RESULT algo=stdsort name=random size=16777216 iters=25*10 time=1135.81 stddev=19.0557 t_gen=1573.56 t_check=0 ok=1 65 | RESULT algo=ssssort name=random size=33554432 iters=25*10 time=1319.78 stddev=1.51531 t_gen=3101.21 t_check=831.641 ok=1 66 | RESULT algo=stdsort name=random size=33554432 iters=25*10 time=2366.81 stddev=26.3953 t_gen=3101.21 t_check=0 ok=1 67 | RESULT algo=ssssort name=random size=67108864 iters=25*10 time=2465.77 stddev=79.2151 t_gen=6189.78 t_check=1584.47 ok=1 68 | RESULT algo=stdsort name=random size=67108864 iters=25*10 time=4941.05 stddev=117.425 t_gen=6189.78 t_check=0 ok=1 69 | RESULT algo=ssssort name=80pcsorted size=1024 iters=100*10 time=0.020138 stddev=0.000592712 t_gen=0.501 t_check=0 ok=1 70 | RESULT algo=stdsort name=80pcsorted size=1024 iters=100*10 time=0.01364 stddev=0.00256612 t_gen=0.501 
t_check=0 ok=1 71 | RESULT algo=ssssort name=80pcsorted size=2048 iters=100*10 time=0.033177 stddev=0.000849937 t_gen=1 t_check=0.102 ok=1 72 | RESULT algo=stdsort name=80pcsorted size=2048 iters=100*10 time=0.035579 stddev=0.00318432 t_gen=1 t_check=0 ok=1 73 | RESULT algo=ssssort name=80pcsorted size=4096 iters=100*10 time=0.065583 stddev=0.00124847 t_gen=1.802 t_check=0.31 ok=1 74 | RESULT algo=stdsort name=80pcsorted size=4096 iters=100*10 time=0.082759 stddev=0.00445137 t_gen=1.802 t_check=0 ok=1 75 | RESULT algo=ssssort name=80pcsorted size=8192 iters=100*10 time=0.140844 stddev=0.00252786 t_gen=3.546 t_check=0.61 ok=1 76 | RESULT algo=stdsort name=80pcsorted size=8192 iters=100*10 time=0.18491 stddev=0.00685178 t_gen=3.546 t_check=0 ok=1 77 | RESULT algo=ssssort name=80pcsorted size=16384 iters=100*10 time=0.304919 stddev=0.00369866 t_gen=6.927 t_check=1.329 ok=1 78 | RESULT algo=stdsort name=80pcsorted size=16384 iters=100*10 time=0.398435 stddev=0.0151918 t_gen=6.927 t_check=0 ok=1 79 | RESULT algo=ssssort name=80pcsorted size=32768 iters=100*10 time=0.660923 stddev=0.00447337 t_gen=13.885 t_check=2.704 ok=1 80 | RESULT algo=stdsort name=80pcsorted size=32768 iters=100*10 time=0.844344 stddev=0.0336834 t_gen=13.885 t_check=0 ok=1 81 | RESULT algo=ssssort name=80pcsorted size=65536 iters=50*10 time=1.40546 stddev=0.00748253 t_gen=13.834 t_check=2.752 ok=1 82 | RESULT algo=stdsort name=80pcsorted size=65536 iters=50*10 time=1.77708 stddev=0.0646128 t_gen=13.834 t_check=0 ok=1 83 | RESULT algo=ssssort name=80pcsorted size=131072 iters=50*10 time=3.00434 stddev=0.03017 t_gen=27.804 t_check=5.542 ok=1 84 | RESULT algo=stdsort name=80pcsorted size=131072 iters=50*10 time=3.82834 stddev=0.191887 t_gen=27.804 t_check=0 ok=1 85 | RESULT algo=ssssort name=80pcsorted size=262144 iters=35*10 time=6.31566 stddev=0.0263057 t_gen=38.922 t_check=8.467 ok=1 86 | RESULT algo=stdsort name=80pcsorted size=262144 iters=35*10 time=8.01887 stddev=0.337457 t_gen=38.922 t_check=0 
ok=1 87 | RESULT algo=ssssort name=80pcsorted size=524288 iters=35*10 time=11.933 stddev=0.019105 t_gen=78.226 t_check=15.361 ok=1 88 | RESULT algo=stdsort name=80pcsorted size=524288 iters=35*10 time=16.6368 stddev=0.642663 t_gen=78.226 t_check=0 ok=1 89 | RESULT algo=ssssort name=80pcsorted size=1048576 iters=35*10 time=23.8228 stddev=0.0314114 t_gen=155.918 t_check=33.797 ok=1 90 | RESULT algo=stdsort name=80pcsorted size=1048576 iters=35*10 time=35.5402 stddev=1.44242 t_gen=155.918 t_check=0 ok=1 91 | RESULT algo=ssssort name=80pcsorted size=2097152 iters=35*10 time=49.7581 stddev=0.0441599 t_gen=313.779 t_check=72.975 ok=1 92 | RESULT algo=stdsort name=80pcsorted size=2097152 iters=35*10 time=74.0804 stddev=3.70768 t_gen=313.779 t_check=0 ok=1 93 | RESULT algo=ssssort name=80pcsorted size=4194304 iters=25*10 time=105.552 stddev=4.47134 t_gen=475.259 t_check=102.921 ok=1 94 | RESULT algo=stdsort name=80pcsorted size=4194304 iters=25*10 time=158.548 stddev=8.72768 t_gen=475.259 t_check=0 ok=1 95 | RESULT algo=ssssort name=80pcsorted size=8388608 iters=25*10 time=228.917 stddev=0.227619 t_gen=943.125 t_check=199.553 ok=1 96 | RESULT algo=stdsort name=80pcsorted size=8388608 iters=25*10 time=327.744 stddev=13.0388 t_gen=943.125 t_check=0 ok=1 97 | RESULT algo=ssssort name=80pcsorted size=16777216 iters=25*10 time=478.376 stddev=1.43545 t_gen=1886.28 t_check=380.601 ok=1 98 | RESULT algo=stdsort name=80pcsorted size=16777216 iters=25*10 time=692.664 stddev=29.8606 t_gen=1886.28 t_check=0 ok=1 99 | RESULT algo=ssssort name=80pcsorted size=33554432 iters=25*10 time=1007.05 stddev=24.1143 t_gen=3833.18 t_check=836.493 ok=1 100 | RESULT algo=stdsort name=80pcsorted size=33554432 iters=25*10 time=1457.68 stddev=45.9682 t_gen=3833.18 t_check=0 ok=1 101 | RESULT algo=ssssort name=80pcsorted size=67108864 iters=25*10 time=2069.43 stddev=55.6648 t_gen=7660.87 t_check=1628.59 ok=1 102 | RESULT algo=stdsort name=80pcsorted size=67108864 iters=25*10 time=3100.56 stddev=118.858 
t_gen=7660.87 t_check=0 ok=1 103 | RESULT algo=ssssort name=90pcsorted size=1024 iters=100*10 time=0.019564 stddev=0.000546079 t_gen=0.4 t_check=0.001 ok=1 104 | RESULT algo=stdsort name=90pcsorted size=1024 iters=100*10 time=0.011028 stddev=0.00237547 t_gen=0.4 t_check=0 ok=1 105 | RESULT algo=ssssort name=90pcsorted size=2048 iters=100*10 time=0.031114 stddev=0.000731806 t_gen=0.6 t_check=0.104 ok=1 106 | RESULT algo=stdsort name=90pcsorted size=2048 iters=100*10 time=0.028949 stddev=0.00416434 t_gen=0.6 t_check=0 ok=1 107 | RESULT algo=ssssort name=90pcsorted size=4096 iters=100*10 time=0.059272 stddev=0.00114688 t_gen=1.102 t_check=0.319 ok=1 108 | RESULT algo=stdsort name=90pcsorted size=4096 iters=100*10 time=0.067679 stddev=0.00558419 t_gen=1.102 t_check=0 ok=1 109 | RESULT algo=ssssort name=90pcsorted size=8192 iters=100*10 time=0.125764 stddev=0.00215103 t_gen=2.001 t_check=0.644 ok=1 110 | RESULT algo=stdsort name=90pcsorted size=8192 iters=100*10 time=0.151834 stddev=0.00766846 t_gen=2.001 t_check=0 ok=1 111 | RESULT algo=ssssort name=90pcsorted size=16384 iters=100*10 time=0.273216 stddev=0.00396271 t_gen=3.908 t_check=1.38 ok=1 112 | RESULT algo=stdsort name=90pcsorted size=16384 iters=100*10 time=0.328203 stddev=0.014319 t_gen=3.908 t_check=0 ok=1 113 | RESULT algo=ssssort name=90pcsorted size=32768 iters=100*10 time=0.602353 stddev=0.0097406 t_gen=7.808 t_check=2.841 ok=1 114 | RESULT algo=stdsort name=90pcsorted size=32768 iters=100*10 time=0.699831 stddev=0.0249691 t_gen=7.808 t_check=0 ok=1 115 | RESULT algo=ssssort name=90pcsorted size=65536 iters=50*10 time=1.27291 stddev=0.0141024 t_gen=7.798 t_check=2.871 ok=1 116 | RESULT algo=stdsort name=90pcsorted size=65536 iters=50*10 time=1.46529 stddev=0.0580091 t_gen=7.798 t_check=0 ok=1 117 | RESULT algo=ssssort name=90pcsorted size=131072 iters=50*10 time=2.73127 stddev=0.0317443 t_gen=15.626 t_check=5.963 ok=1 118 | RESULT algo=stdsort name=90pcsorted size=131072 iters=50*10 time=3.07663 
stddev=0.121194 t_gen=15.626 t_check=0 ok=1 119 | RESULT algo=ssssort name=90pcsorted size=262144 iters=35*10 time=5.99253 stddev=0.0361599 t_gen=21.906 t_check=8.373 ok=1 120 | RESULT algo=stdsort name=90pcsorted size=262144 iters=35*10 time=6.46105 stddev=0.275658 t_gen=21.906 t_check=0 ok=1 121 | RESULT algo=ssssort name=90pcsorted size=524288 iters=35*10 time=11.4855 stddev=0.0217949 t_gen=44.169 t_check=15.251 ok=1 122 | RESULT algo=stdsort name=90pcsorted size=524288 iters=35*10 time=13.6837 stddev=1.06916 t_gen=44.169 t_check=0 ok=1 123 | RESULT algo=ssssort name=90pcsorted size=1048576 iters=35*10 time=22.3479 stddev=0.0325837 t_gen=88.233 t_check=32.524 ok=1 124 | RESULT algo=stdsort name=90pcsorted size=1048576 iters=35*10 time=28.5204 stddev=1.08987 t_gen=88.233 t_check=0 ok=1 125 | RESULT algo=ssssort name=90pcsorted size=2097152 iters=35*10 time=45.8717 stddev=0.109925 t_gen=179.47 t_check=72.602 ok=1 126 | RESULT algo=stdsort name=90pcsorted size=2097152 iters=35*10 time=59.5213 stddev=2.26522 t_gen=179.47 t_check=0 ok=1 127 | RESULT algo=ssssort name=90pcsorted size=4194304 iters=25*10 time=98.1055 stddev=6.40846 t_gen=281.822 t_check=104.166 ok=1 128 | RESULT algo=stdsort name=90pcsorted size=4194304 iters=25*10 time=128.718 stddev=7.99597 t_gen=281.822 t_check=0 ok=1 129 | RESULT algo=ssssort name=90pcsorted size=8388608 iters=25*10 time=211.967 stddev=0.145249 t_gen=555.949 t_check=199.97 ok=1 130 | RESULT algo=stdsort name=90pcsorted size=8388608 iters=25*10 time=262.095 stddev=9.1112 t_gen=555.949 t_check=0 ok=1 131 | RESULT algo=ssssort name=90pcsorted size=16777216 iters=25*10 time=441.302 stddev=0.509905 t_gen=1110.7 t_check=375.472 ok=1 132 | RESULT algo=stdsort name=90pcsorted size=16777216 iters=25*10 time=554.536 stddev=16.7027 t_gen=1110.7 t_check=0 ok=1 133 | RESULT algo=ssssort name=90pcsorted size=33554432 iters=25*10 time=938.315 stddev=47.4134 t_gen=2264.34 t_check=837.168 ok=1 134 | RESULT algo=stdsort name=90pcsorted size=33554432 
iters=25*10 time=1191.13 stddev=41.2331 t_gen=2264.34 t_check=0 ok=1 135 | RESULT algo=ssssort name=90pcsorted size=67108864 iters=25*10 time=1977.35 stddev=31.5028 t_gen=4491.97 t_check=1617.61 ok=1 136 | RESULT algo=stdsort name=90pcsorted size=67108864 iters=25*10 time=2516.54 stddev=64.0046 t_gen=4491.97 t_check=0 ok=1 137 | RESULT algo=ssssort name=99pcsorted size=1024 iters=100*10 time=0.019107 stddev=0.00072116 t_gen=0.2 t_check=0 ok=1 138 | RESULT algo=stdsort name=99pcsorted size=1024 iters=100*10 time=0.007594 stddev=0.004076 t_gen=0.2 t_check=0 ok=1 139 | RESULT algo=ssssort name=99pcsorted size=2048 iters=100*10 time=0.028723 stddev=0.000696244 t_gen=0.3 t_check=0.1 ok=1 140 | RESULT algo=stdsort name=99pcsorted size=2048 iters=100*10 time=0.021027 stddev=0.00518412 t_gen=0.3 t_check=0 ok=1 141 | RESULT algo=ssssort name=99pcsorted size=4096 iters=100*10 time=0.050409 stddev=0.00103769 t_gen=0.4 t_check=0.311 ok=1 142 | RESULT algo=stdsort name=99pcsorted size=4096 iters=100*10 time=0.056715 stddev=0.0123744 t_gen=0.4 t_check=0 ok=1 143 | RESULT algo=ssssort name=99pcsorted size=8192 iters=100*10 time=0.098811 stddev=0.00243006 t_gen=0.701 t_check=0.63 ok=1 144 | RESULT algo=stdsort name=99pcsorted size=8192 iters=100*10 time=0.143735 stddev=0.0275058 t_gen=0.701 t_check=0 ok=1 145 | RESULT algo=ssssort name=99pcsorted size=16384 iters=100*10 time=0.212404 stddev=0.00638542 t_gen=1.202 t_check=1.363 ok=1 146 | RESULT algo=stdsort name=99pcsorted size=16384 iters=100*10 time=0.331255 stddev=0.0516303 t_gen=1.202 t_check=0 ok=1 147 | RESULT algo=ssssort name=99pcsorted size=32768 iters=100*10 time=0.556795 stddev=0.0160199 t_gen=2.337 t_check=2.84 ok=1 148 | RESULT algo=stdsort name=99pcsorted size=32768 iters=100*10 time=0.744774 stddev=0.100698 t_gen=2.337 t_check=0 ok=1 149 | RESULT algo=ssssort name=99pcsorted size=65536 iters=50*10 time=1.3175 stddev=0.0348127 t_gen=2.34 t_check=2.965 ok=1 150 | RESULT algo=stdsort name=99pcsorted size=65536 
iters=50*10 time=1.59271 stddev=0.18394 t_gen=2.34 t_check=0 ok=1 151 | RESULT algo=ssssort name=99pcsorted size=131072 iters=50*10 time=2.98073 stddev=0.0724767 t_gen=4.798 t_check=6.435 ok=1 152 | RESULT algo=stdsort name=99pcsorted size=131072 iters=50*10 time=3.33804 stddev=0.43164 t_gen=4.798 t_check=0 ok=1 153 | RESULT algo=ssssort name=99pcsorted size=262144 iters=35*10 time=6.15912 stddev=0.0644016 t_gen=6.548 t_check=8.036 ok=1 154 | RESULT algo=stdsort name=99pcsorted size=262144 iters=35*10 time=6.78389 stddev=0.814735 t_gen=6.548 t_check=0 ok=1 155 | RESULT algo=ssssort name=99pcsorted size=524288 iters=35*10 time=10.941 stddev=0.030309 t_gen=13.468 t_check=15.568 ok=1 156 | RESULT algo=stdsort name=99pcsorted size=524288 iters=35*10 time=14.5766 stddev=2.19981 t_gen=13.468 t_check=0 ok=1 157 | RESULT algo=ssssort name=99pcsorted size=1048576 iters=35*10 time=20.422 stddev=1.11687 t_gen=26.952 t_check=34.321 ok=1 158 | RESULT algo=stdsort name=99pcsorted size=1048576 iters=35*10 time=30.2656 stddev=4.7988 t_gen=26.952 t_check=0 ok=1 159 | RESULT algo=ssssort name=99pcsorted size=2097152 iters=35*10 time=39.8266 stddev=0.0607296 t_gen=58.809 t_check=73.04 ok=1 160 | RESULT algo=stdsort name=99pcsorted size=2097152 iters=35*10 time=60.9253 stddev=5.61643 t_gen=58.809 t_check=0 ok=1 161 | RESULT algo=ssssort name=99pcsorted size=4194304 iters=25*10 time=83.8774 stddev=0.16774 t_gen=103.352 t_check=102.512 ok=1 162 | RESULT algo=stdsort name=99pcsorted size=4194304 iters=25*10 time=122.332 stddev=13.8283 t_gen=103.352 t_check=0 ok=1 163 | RESULT algo=ssssort name=99pcsorted size=8388608 iters=25*10 time=203.876 stddev=0.292582 t_gen=207.316 t_check=200.124 ok=1 164 | RESULT algo=stdsort name=99pcsorted size=8388608 iters=25*10 time=265.466 stddev=18.9113 t_gen=207.316 t_check=0 ok=1 165 | RESULT algo=ssssort name=99pcsorted size=16777216 iters=25*10 time=471.286 stddev=24.9994 t_gen=419.317 t_check=413.75 ok=1 166 | RESULT algo=stdsort name=99pcsorted 
size=16777216 iters=25*10 time=576.595 stddev=54.7584 t_gen=419.317 t_check=0 ok=1 167 | RESULT algo=ssssort name=99pcsorted size=33554432 iters=25*10 time=1010.55 stddev=27.0742 t_gen=836.959 t_check=822.547 ok=1 168 | RESULT algo=stdsort name=99pcsorted size=33554432 iters=25*10 time=1163.68 stddev=126.603 t_gen=836.959 t_check=0 ok=1 169 | RESULT algo=ssssort name=99pcsorted size=67108864 iters=25*10 time=2041.35 stddev=29.8247 t_gen=1684.4 t_check=1631.32 ok=1 170 | RESULT algo=stdsort name=99pcsorted size=67108864 iters=25*10 time=2604.54 stddev=387.75 t_gen=1684.4 t_check=0 ok=1 171 | RESULT algo=ssssort name=99.9pcsorted size=1024 iters=100*10 time=0.018933 stddev=0.000552101 t_gen=0.2 t_check=0.005 ok=1 172 | RESULT algo=stdsort name=99.9pcsorted size=1024 iters=100*10 time=0.004666 stddev=0.00103029 t_gen=0.2 t_check=0 ok=1 173 | RESULT algo=ssssort name=99.9pcsorted size=2048 iters=100*10 time=0.028334 stddev=0.000714811 t_gen=0.226 t_check=0.102 ok=1 174 | RESULT algo=stdsort name=99.9pcsorted size=2048 iters=100*10 time=0.012469 stddev=0.00320863 t_gen=0.226 t_check=0 ok=1 175 | RESULT algo=ssssort name=99.9pcsorted size=4096 iters=100*10 time=0.04873 stddev=0.000905494 t_gen=0.301 t_check=0.302 ok=1 176 | RESULT algo=stdsort name=99.9pcsorted size=4096 iters=100*10 time=0.036193 stddev=0.0100791 t_gen=0.301 t_check=0 ok=1 177 | RESULT algo=ssssort name=99.9pcsorted size=8192 iters=100*10 time=0.092489 stddev=0.00166753 t_gen=0.591 t_check=0.602 ok=1 178 | RESULT algo=stdsort name=99.9pcsorted size=8192 iters=100*10 time=0.106762 stddev=0.0326891 t_gen=0.591 t_check=0 ok=1 179 | RESULT algo=ssssort name=99.9pcsorted size=16384 iters=100*10 time=0.186441 stddev=0.00292422 t_gen=0.911 t_check=1.321 ok=1 180 | RESULT algo=stdsort name=99.9pcsorted size=16384 iters=100*10 time=0.298806 stddev=0.0902639 t_gen=0.911 t_check=0 ok=1 181 | RESULT algo=ssssort name=99.9pcsorted size=32768 iters=100*10 time=0.527222 stddev=0.0191273 t_gen=1.742 t_check=2.71 ok=1 
182 | RESULT algo=stdsort name=99.9pcsorted size=32768 iters=100*10 time=0.73965 stddev=0.198224 t_gen=1.742 t_check=0 ok=1 183 | RESULT algo=ssssort name=99.9pcsorted size=65536 iters=50*10 time=1.33265 stddev=0.0464879 t_gen=1.777 t_check=2.822 ok=1 184 | RESULT algo=stdsort name=99.9pcsorted size=65536 iters=50*10 time=1.74431 stddev=0.364542 t_gen=1.777 t_check=0 ok=1 185 | RESULT algo=ssssort name=99.9pcsorted size=131072 iters=50*10 time=3.08562 stddev=0.0868162 t_gen=3.614 t_check=5.617 ok=1 186 | RESULT algo=stdsort name=99.9pcsorted size=131072 iters=50*10 time=4.04175 stddev=0.699333 t_gen=3.614 t_check=0 ok=1 187 | RESULT algo=ssssort name=99.9pcsorted size=262144 iters=35*10 time=6.34807 stddev=0.0835428 t_gen=4.941 t_check=7.639 ok=1 188 | RESULT algo=stdsort name=99.9pcsorted size=262144 iters=35*10 time=8.48257 stddev=1.42268 t_gen=4.941 t_check=0 ok=1 189 | RESULT algo=ssssort name=99.9pcsorted size=524288 iters=35*10 time=10.8753 stddev=0.0345017 t_gen=10.594 t_check=15.412 ok=1 190 | RESULT algo=stdsort name=99.9pcsorted size=524288 iters=35*10 time=18.2935 stddev=2.47157 t_gen=10.594 t_check=0 ok=1 191 | RESULT algo=ssssort name=99.9pcsorted size=1048576 iters=35*10 time=19.8304 stddev=0.0246053 t_gen=20.493 t_check=33.691 ok=1 192 | RESULT algo=stdsort name=99.9pcsorted size=1048576 iters=35*10 time=37.7995 stddev=5.48426 t_gen=20.493 t_check=0 ok=1 193 | RESULT algo=ssssort name=99.9pcsorted size=2097152 iters=35*10 time=38.1612 stddev=0.0466065 t_gen=47.873 t_check=72.773 ok=1 194 | RESULT algo=stdsort name=99.9pcsorted size=2097152 iters=35*10 time=81.2098 stddev=12.8085 t_gen=47.873 t_check=0 ok=1 195 | RESULT algo=ssssort name=99.9pcsorted size=4194304 iters=25*10 time=76.762 stddev=2.27163 t_gen=87.309 t_check=103.281 ok=1 196 | RESULT algo=stdsort name=99.9pcsorted size=4194304 iters=25*10 time=175.529 stddev=27.8304 t_gen=87.309 t_check=0 ok=1 197 | RESULT algo=ssssort name=99.9pcsorted size=8388608 iters=25*10 time=198.104 stddev=11.757 
t_gen=174.775 t_check=203.733 ok=1 198 | RESULT algo=stdsort name=99.9pcsorted size=8388608 iters=25*10 time=352.768 stddev=44.1476 t_gen=174.775 t_check=0 ok=1 199 | RESULT algo=ssssort name=99.9pcsorted size=16777216 iters=25*10 time=454.997 stddev=7.37176 t_gen=342.576 t_check=378.553 ok=1 200 | RESULT algo=stdsort name=99.9pcsorted size=16777216 iters=25*10 time=770.851 stddev=104.481 t_gen=342.576 t_check=0 ok=1 201 | RESULT algo=ssssort name=99.9pcsorted size=33554432 iters=25*10 time=1016.1 stddev=17.4795 t_gen=696.946 t_check=823.731 ok=1 202 | RESULT algo=stdsort name=99.9pcsorted size=33554432 iters=25*10 time=1642.05 stddev=191.792 t_gen=696.946 t_check=0 ok=1 203 | RESULT algo=ssssort name=99.9pcsorted size=67108864 iters=25*10 time=2085.33 stddev=62.7843 t_gen=1400.6 t_check=1638.82 ok=1 204 | RESULT algo=stdsort name=99.9pcsorted size=67108864 iters=25*10 time=3521.56 stddev=319.656 t_gen=1400.6 t_check=0 ok=1 205 | RESULT algo=ssssort name=tail90 size=1024 iters=100*10 time=0.019434 stddev=0.00061157 t_gen=0.201 t_check=0 ok=1 206 | RESULT algo=stdsort name=tail90 size=1024 iters=100*10 time=0.011224 stddev=0.00248596 t_gen=0.201 t_check=0 ok=1 207 | RESULT algo=ssssort name=tail90 size=2048 iters=100*10 time=0.03018 stddev=0.000723964 t_gen=0.3 t_check=0.104 ok=1 208 | RESULT algo=stdsort name=tail90 size=2048 iters=100*10 time=0.029826 stddev=0.00340783 t_gen=0.3 t_check=0 ok=1 209 | RESULT algo=ssssort name=tail90 size=4096 iters=100*10 time=0.057874 stddev=0.00145198 t_gen=0.401 t_check=0.312 ok=1 210 | RESULT algo=stdsort name=tail90 size=4096 iters=100*10 time=0.072151 stddev=0.006528 t_gen=0.401 t_check=0 ok=1 211 | RESULT algo=ssssort name=tail90 size=8192 iters=100*10 time=0.124777 stddev=0.00188436 t_gen=0.8 t_check=0.658 ok=1 212 | RESULT algo=stdsort name=tail90 size=8192 iters=100*10 time=0.164629 stddev=0.0117803 t_gen=0.8 t_check=0 ok=1 213 | RESULT algo=ssssort name=tail90 size=16384 iters=100*10 time=0.274294 stddev=0.00278951 
t_gen=1.394 t_check=1.368 ok=1 214 | RESULT algo=stdsort name=tail90 size=16384 iters=100*10 time=0.355064 stddev=0.0178036 t_gen=1.394 t_check=0 ok=1 215 | RESULT algo=ssssort name=tail90 size=32768 iters=100*10 time=0.609121 stddev=0.00686699 t_gen=2.655 t_check=2.776 ok=1 216 | RESULT algo=stdsort name=tail90 size=32768 iters=100*10 time=0.754939 stddev=0.0318701 t_gen=2.655 t_check=0 ok=1 217 | RESULT algo=ssssort name=tail90 size=65536 iters=50*10 time=1.31615 stddev=0.0171625 t_gen=2.698 t_check=2.914 ok=1 218 | RESULT algo=stdsort name=tail90 size=65536 iters=50*10 time=1.6584 stddev=0.13163 t_gen=2.698 t_check=0 ok=1 219 | RESULT algo=ssssort name=tail90 size=131072 iters=50*10 time=2.85931 stddev=0.0337854 t_gen=5.423 t_check=5.837 ok=1 220 | RESULT algo=stdsort name=tail90 size=131072 iters=50*10 time=3.39261 stddev=0.146085 t_gen=5.423 t_check=0 ok=1 221 | RESULT algo=ssssort name=tail90 size=262144 iters=35*10 time=6.12189 stddev=0.0426652 t_gen=7.635 t_check=8.467 ok=1 222 | RESULT algo=stdsort name=tail90 size=262144 iters=35*10 time=7.1499 stddev=0.225718 t_gen=7.635 t_check=0 ok=1 223 | RESULT algo=ssssort name=tail90 size=524288 iters=35*10 time=11.4691 stddev=0.0432393 t_gen=15.644 t_check=15.4 ok=1 224 | RESULT algo=stdsort name=tail90 size=524288 iters=35*10 time=14.6786 stddev=0.32739 t_gen=15.644 t_check=0 ok=1 225 | RESULT algo=ssssort name=tail90 size=1048576 iters=35*10 time=22.4333 stddev=0.08607 t_gen=30.577 t_check=33.745 ok=1 226 | RESULT algo=stdsort name=tail90 size=1048576 iters=35*10 time=30.9241 stddev=0.487763 t_gen=30.577 t_check=0 ok=1 227 | RESULT algo=ssssort name=tail90 size=2097152 iters=35*10 time=46.6022 stddev=0.167553 t_gen=67.702 t_check=72.901 ok=1 228 | RESULT algo=stdsort name=tail90 size=2097152 iters=35*10 time=65.745 stddev=2.69429 t_gen=67.702 t_check=0 ok=1 229 | RESULT algo=ssssort name=tail90 size=4194304 iters=25*10 time=98.4353 stddev=0.368853 t_gen=115.102 t_check=101.891 ok=1 230 | RESULT algo=stdsort 
name=tail90 size=4194304 iters=25*10 time=136.552 stddev=4.33371 t_gen=115.102 t_check=0 ok=1 231 | RESULT algo=ssssort name=tail90 size=8388608 iters=25*10 time=218.163 stddev=0.143762 t_gen=228.205 t_check=199.312 ok=1 232 | RESULT algo=stdsort name=tail90 size=8388608 iters=25*10 time=288.898 stddev=11.6433 t_gen=228.205 t_check=0 ok=1 233 | RESULT algo=ssssort name=tail90 size=16777216 iters=25*10 time=460.932 stddev=14.8245 t_gen=452.12 t_check=377.253 ok=1 234 | RESULT algo=stdsort name=tail90 size=16777216 iters=25*10 time=601.948 stddev=30.7663 t_gen=452.12 t_check=0 ok=1 235 | RESULT algo=ssssort name=tail90 size=33554432 iters=25*10 time=970.419 stddev=11.5266 t_gen=922.009 t_check=841.117 ok=1 236 | RESULT algo=stdsort name=tail90 size=33554432 iters=25*10 time=1241.53 stddev=18.4208 t_gen=922.009 t_check=0 ok=1 237 | RESULT algo=ssssort name=tail90 size=67108864 iters=25*10 time=2019.67 stddev=55.3668 t_gen=1837.42 t_check=1626.2 ok=1 238 | RESULT algo=stdsort name=tail90 size=67108864 iters=25*10 time=2611.36 stddev=97.9557 t_gen=1837.42 t_check=0 ok=1 239 | RESULT algo=ssssort name=tail99 size=1024 iters=100*10 time=0.019067 stddev=0.00056465 t_gen=0.2 t_check=0 ok=1 240 | RESULT algo=stdsort name=tail99 size=1024 iters=100*10 time=0.01009 stddev=0.00354407 t_gen=0.2 t_check=0 ok=1 241 | RESULT algo=ssssort name=tail99 size=2048 iters=100*10 time=0.028637 stddev=0.00065701 t_gen=0.241 t_check=0.108 ok=1 242 | RESULT algo=stdsort name=tail99 size=2048 iters=100*10 time=0.026727 stddev=0.00616786 t_gen=0.241 t_check=0 ok=1 243 | RESULT algo=ssssort name=tail99 size=4096 iters=100*10 time=0.050246 stddev=0.0010875 t_gen=0.315 t_check=0.314 ok=1 244 | RESULT algo=stdsort name=tail99 size=4096 iters=100*10 time=0.070249 stddev=0.0113647 t_gen=0.315 t_check=0 ok=1 245 | RESULT algo=ssssort name=tail99 size=8192 iters=100*10 time=0.09843 stddev=0.00283498 t_gen=0.6 t_check=0.64 ok=1 246 | RESULT algo=stdsort name=tail99 size=8192 iters=100*10 time=0.163774 
stddev=0.0241136 t_gen=0.6 t_check=0 ok=1 247 | RESULT algo=ssssort name=tail99 size=16384 iters=100*10 time=0.211526 stddev=0.00660628 t_gen=0.95 t_check=1.332 ok=1 248 | RESULT algo=stdsort name=tail99 size=16384 iters=100*10 time=0.356388 stddev=0.0448789 t_gen=0.95 t_check=0 ok=1 249 | RESULT algo=ssssort name=tail99 size=32768 iters=100*10 time=0.555193 stddev=0.0167604 t_gen=1.792 t_check=2.816 ok=1 250 | RESULT algo=stdsort name=tail99 size=32768 iters=100*10 time=0.782215 stddev=0.102361 t_gen=1.792 t_check=0 ok=1 251 | RESULT algo=ssssort name=tail99 size=65536 iters=50*10 time=1.30864 stddev=0.0362536 t_gen=1.9 t_check=2.964 ok=1 252 | RESULT algo=stdsort name=tail99 size=65536 iters=50*10 time=1.81529 stddev=0.303655 t_gen=1.9 t_check=0 ok=1 253 | RESULT algo=ssssort name=tail99 size=131072 iters=50*10 time=2.95253 stddev=0.0752186 t_gen=3.734 t_check=5.882 ok=1 254 | RESULT algo=stdsort name=tail99 size=131072 iters=50*10 time=3.91253 stddev=0.518094 t_gen=3.734 t_check=0 ok=1 255 | RESULT algo=ssssort name=tail99 size=262144 iters=35*10 time=6.13649 stddev=0.0945973 t_gen=5.216 t_check=8.412 ok=1 256 | RESULT algo=stdsort name=tail99 size=262144 iters=35*10 time=8.58843 stddev=0.939567 t_gen=5.216 t_check=0 ok=1 257 | RESULT algo=ssssort name=tail99 size=524288 iters=35*10 time=11.3872 stddev=1.12419 t_gen=10.943 t_check=16.328 ok=1 258 | RESULT algo=stdsort name=tail99 size=524288 iters=35*10 time=18.585 stddev=2.87028 t_gen=10.943 t_check=0 ok=1 259 | RESULT algo=ssssort name=tail99 size=1048576 iters=35*10 time=21.1623 stddev=2.11787 t_gen=22.059 t_check=33.849 ok=1 260 | RESULT algo=stdsort name=tail99 size=1048576 iters=35*10 time=38.0514 stddev=5.14157 t_gen=22.059 t_check=0 ok=1 261 | RESULT algo=ssssort name=tail99 size=2097152 iters=35*10 time=39.8842 stddev=0.169678 t_gen=46.408 t_check=72.96 ok=1 262 | RESULT algo=stdsort name=tail99 size=2097152 iters=35*10 time=77.0316 stddev=6.50885 t_gen=46.408 t_check=0 ok=1 263 | RESULT algo=ssssort 
name=tail99 size=4194304 iters=25*10 time=83.0783 stddev=0.452522 t_gen=88.358 t_check=102.321 ok=1 264 | RESULT algo=stdsort name=tail99 size=4194304 iters=25*10 time=163.217 stddev=15.427 t_gen=88.358 t_check=0 ok=1 265 | RESULT algo=ssssort name=tail99 size=8388608 iters=25*10 time=203.861 stddev=0.427406 t_gen=174.621 t_check=200.333 ok=1 266 | RESULT algo=stdsort name=tail99 size=8388608 iters=25*10 time=347.792 stddev=29.5512 t_gen=174.621 t_check=0 ok=1 267 | RESULT algo=ssssort name=tail99 size=16777216 iters=25*10 time=455.041 stddev=0.599972 t_gen=344.718 t_check=377.997 ok=1 268 | RESULT algo=stdsort name=tail99 size=16777216 iters=25*10 time=706.422 stddev=53.6026 t_gen=344.718 t_check=0 ok=1 269 | RESULT algo=ssssort name=tail99 size=33554432 iters=25*10 time=994.404 stddev=40.4909 t_gen=697.134 t_check=828.998 ok=1 270 | RESULT algo=stdsort name=tail99 size=33554432 iters=25*10 time=1482.39 stddev=120.457 t_gen=697.134 t_check=0 ok=1 271 | RESULT algo=ssssort name=tail99 size=67108864 iters=25*10 time=2025.74 stddev=41.8374 t_gen=1401.69 t_check=1632.45 ok=1 272 | RESULT algo=stdsort name=tail99 size=67108864 iters=25*10 time=2999.09 stddev=250.982 t_gen=1401.69 t_check=0 ok=1 273 | RESULT algo=ssssort name=sorted size=1024 iters=1*1000 time=0.01905 stddev=0.000582958 t_gen=0 t_check=0 ok=1 274 | RESULT algo=stdsort name=sorted size=1024 iters=1*1000 time=0.005003 stddev=7.06824e-05 t_gen=0 t_check=0 ok=1 275 | RESULT algo=ssssort name=sorted size=2048 iters=1*1000 time=0.028025 stddev=0.000626712 t_gen=0 t_check=0.001 ok=1 276 | RESULT algo=stdsort name=sorted size=2048 iters=1*1000 time=0.012008 stddev=0.000141266 t_gen=0 t_check=0 ok=1 277 | RESULT algo=ssssort name=sorted size=4096 iters=1*1000 time=0.048707 stddev=0.000851984 t_gen=0.001 t_check=0.003 ok=1 278 | RESULT algo=stdsort name=sorted size=4096 iters=1*1000 time=0.026009 stddev=0.000122204 t_gen=0.001 t_check=0 ok=1 279 | RESULT algo=ssssort name=sorted size=8192 iters=1*1000 
time=0.090873 stddev=0.00120595 t_gen=0.003 t_check=0.006 ok=1 280 | RESULT algo=stdsort name=sorted size=8192 iters=1*1000 time=0.057048 stddev=0.000777368 t_gen=0.003 t_check=0 ok=1 281 | RESULT algo=ssssort name=sorted size=16384 iters=1*500 time=0.183122 stddev=0.00216529 t_gen=0.007 t_check=0.013 ok=1 282 | RESULT algo=stdsort name=sorted size=16384 iters=1*500 time=0.123048 stddev=0.00107331 t_gen=0.007 t_check=0 ok=1 283 | RESULT algo=ssssort name=sorted size=32768 iters=1*500 time=0.523564 stddev=0.0195546 t_gen=0.035 t_check=0.026 ok=1 284 | RESULT algo=stdsort name=sorted size=32768 iters=1*500 time=0.34747 stddev=0.0199286 t_gen=0.035 t_check=0 ok=1 285 | RESULT algo=ssssort name=sorted size=65536 iters=1*250 time=1.34502 stddev=0.0539137 t_gen=0.071 t_check=0.052 ok=1 286 | RESULT algo=stdsort name=sorted size=65536 iters=1*250 time=0.56068 stddev=0.00290851 t_gen=0.071 t_check=0 ok=1 287 | RESULT algo=ssssort name=sorted size=131072 iters=1*250 time=3.19172 stddev=0.0933399 t_gen=0.145 t_check=0.105 ok=1 288 | RESULT algo=stdsort name=sorted size=131072 iters=1*250 time=1.18647 stddev=0.00389537 t_gen=0.145 t_check=0 ok=1 289 | RESULT algo=ssssort name=sorted size=262144 iters=1*100 time=6.53489 stddev=0.0773401 t_gen=0.301 t_check=0.21 ok=1 290 | RESULT algo=stdsort name=sorted size=262144 iters=1*100 time=2.50558 stddev=0.0074144 t_gen=0.301 t_check=0 ok=1 291 | RESULT algo=ssssort name=sorted size=524288 iters=1*100 time=11.0388 stddev=0.0303554 t_gen=0.593 t_check=0.443 ok=1 292 | RESULT algo=stdsort name=sorted size=524288 iters=1*100 time=5.29314 stddev=0.00909325 t_gen=0.593 t_check=0 ok=1 293 | RESULT algo=ssssort name=sorted size=1048576 iters=1*100 time=20.2542 stddev=0.022832 t_gen=1.374 t_check=0.971 ok=1 294 | RESULT algo=stdsort name=sorted size=1048576 iters=1*100 time=11.1876 stddev=0.0849145 t_gen=1.374 t_check=0 ok=1 295 | RESULT algo=ssssort name=sorted size=2097152 iters=1*100 time=38.7707 stddev=0.0385084 t_gen=2.833 t_check=2.102 
ok=1 296 | RESULT algo=stdsort name=sorted size=2097152 iters=1*100 time=23.5182 stddev=0.0441156 t_gen=2.833 t_check=0 ok=1 297 | RESULT algo=ssssort name=sorted size=4194304 iters=1*100 time=77.1495 stddev=0.179346 t_gen=8.41 t_check=4.086 ok=1 298 | RESULT algo=stdsort name=sorted size=4194304 iters=1*100 time=49.6215 stddev=0.200218 t_gen=8.41 t_check=0 ok=1 299 | RESULT algo=ssssort name=sorted size=8388608 iters=1*100 time=198.112 stddev=0.398769 t_gen=16.734 t_check=8.401 ok=1 300 | RESULT algo=stdsort name=sorted size=8388608 iters=1*100 time=104.217 stddev=0.132021 t_gen=16.734 t_check=0 ok=1 301 | RESULT algo=ssssort name=sorted size=16777216 iters=1*100 time=465.896 stddev=20.3053 t_gen=33.645 t_check=15.748 ok=1 302 | RESULT algo=stdsort name=sorted size=16777216 iters=1*100 time=218.664 stddev=0.4083 t_gen=33.645 t_check=0 ok=1 303 | RESULT algo=ssssort name=sorted size=33554432 iters=1*100 time=1041.36 stddev=1.46404 t_gen=67.672 t_check=32.568 ok=1 304 | RESULT algo=stdsort name=sorted size=33554432 iters=1*100 time=456.419 stddev=0.71399 t_gen=67.672 t_check=0 ok=1 305 | RESULT algo=ssssort name=sorted size=67108864 iters=1*100 time=2129.21 stddev=83.9663 t_gen=134.171 t_check=65.578 ok=1 306 | RESULT algo=stdsort name=sorted size=67108864 iters=1*100 time=952.638 stddev=1.93841 t_gen=134.171 t_check=0 ok=1 307 | RESULT algo=ssssort name=reverse size=1024 iters=1*1000 time=0.020522 stddev=0.00056554 t_gen=0 t_check=0 ok=1 308 | RESULT algo=stdsort name=reverse size=1024 iters=1*1000 time=0.004002 stddev=6.32456e-05 t_gen=0 t_check=0 ok=1 309 | RESULT algo=ssssort name=reverse size=2048 iters=1*1000 time=0.033555 stddev=0.000997983 t_gen=0 t_check=0.001 ok=1 310 | RESULT algo=stdsort name=reverse size=2048 iters=1*1000 time=0.009004 stddev=7.7395e-05 t_gen=0 t_check=0 ok=1 311 | RESULT algo=ssssort name=reverse size=4096 iters=1*1000 time=0.07326 stddev=0.00228158 t_gen=0.001 t_check=0.003 ok=1 312 | RESULT algo=stdsort name=reverse size=4096 
iters=1*1000 time=0.020636 stddev=0.000495732 t_gen=0.001 t_check=0 ok=1 313 | RESULT algo=ssssort name=reverse size=8192 iters=1*1000 time=0.198462 stddev=0.00560767 t_gen=0.003 t_check=0.006 ok=1 314 | RESULT algo=stdsort name=reverse size=8192 iters=1*1000 time=0.045007 stddev=0.000113859 t_gen=0.003 t_check=0 ok=1 315 | RESULT algo=ssssort name=reverse size=16384 iters=1*500 time=0.535512 stddev=0.00817579 t_gen=0.018 t_check=0.013 ok=1 316 | RESULT algo=stdsort name=reverse size=16384 iters=1*500 time=0.096112 stddev=0.000837172 t_gen=0.018 t_check=0 ok=1 317 | RESULT algo=ssssort name=reverse size=32768 iters=1*500 time=0.627018 stddev=0.0256127 t_gen=0.036 t_check=0.026 ok=1 318 | RESULT algo=stdsort name=reverse size=32768 iters=1*500 time=0.204996 stddev=0.00138558 t_gen=0.036 t_check=0 ok=1 319 | RESULT algo=ssssort name=reverse size=65536 iters=1*250 time=1.30959 stddev=0.0441885 t_gen=0.073 t_check=0.052 ok=1 320 | RESULT algo=stdsort name=reverse size=65536 iters=1*250 time=0.434644 stddev=0.00215961 t_gen=0.073 t_check=0 ok=1 321 | RESULT algo=ssssort name=reverse size=131072 iters=1*250 time=2.88858 stddev=0.0611811 t_gen=0.15 t_check=0.108 ok=1 322 | RESULT algo=stdsort name=reverse size=131072 iters=1*250 time=0.98762 stddev=0.0705435 t_gen=0.15 t_check=0 ok=1 323 | RESULT algo=ssssort name=reverse size=262144 iters=1*100 time=6.55002 stddev=0.0960087 t_gen=0.308 t_check=0.21 ok=1 324 | RESULT algo=stdsort name=reverse size=262144 iters=1*100 time=1.95403 stddev=0.00662175 t_gen=0.308 t_check=0 ok=1 325 | RESULT algo=ssssort name=reverse size=524288 iters=1*100 time=12.6285 stddev=0.0398392 t_gen=0.617 t_check=0.464 ok=1 326 | RESULT algo=stdsort name=reverse size=524288 iters=1*100 time=4.13335 stddev=0.048656 t_gen=0.617 t_check=0 ok=1 327 | RESULT algo=ssssort name=reverse size=1048576 iters=1*100 time=26.665 stddev=0.0749408 t_gen=1.38 t_check=0.931 ok=1 328 | RESULT algo=stdsort name=reverse size=1048576 iters=1*100 time=8.65774 
stddev=0.0298202 t_gen=1.38 t_check=0 ok=1 329 | RESULT algo=ssssort name=reverse size=2097152 iters=1*100 time=63.6559 stddev=0.13571 t_gen=2.912 t_check=2.076 ok=1 330 | RESULT algo=stdsort name=reverse size=2097152 iters=1*100 time=18.0911 stddev=0.0274533 t_gen=2.912 t_check=0 ok=1 331 | RESULT algo=ssssort name=reverse size=4194304 iters=1*100 time=152.407 stddev=0.588847 t_gen=8.658 t_check=4.335 ok=1 332 | RESULT algo=stdsort name=reverse size=4194304 iters=1*100 time=38.2367 stddev=0.0980448 t_gen=8.658 t_check=0 ok=1 333 | RESULT algo=ssssort name=reverse size=8388608 iters=1*100 time=225.92 stddev=1.1601 t_gen=17.236 t_check=7.993 ok=1 334 | RESULT algo=stdsort name=reverse size=8388608 iters=1*100 time=82.2737 stddev=7.51975 t_gen=17.236 t_check=0 ok=1 335 | RESULT algo=ssssort name=reverse size=16777216 iters=1*100 time=457.662 stddev=0.798614 t_gen=34.352 t_check=15.772 ok=1 336 | RESULT algo=stdsort name=reverse size=16777216 iters=1*100 time=167.107 stddev=0.200763 t_gen=34.352 t_check=0 ok=1 337 | RESULT algo=ssssort name=reverse size=33554432 iters=1*100 time=992.291 stddev=35.2011 t_gen=68.722 t_check=38.564 ok=1 338 | RESULT algo=stdsort name=reverse size=33554432 iters=1*100 time=401.384 stddev=1.87879 t_gen=68.722 t_check=0 ok=1 339 | RESULT algo=ssssort name=reverse size=67108864 iters=1*100 time=2119.54 stddev=68.5177 t_gen=154.784 t_check=65.374 ok=1 340 | RESULT algo=stdsort name=reverse size=67108864 iters=1*100 time=730.882 stddev=30.623 t_gen=154.784 t_check=0 ok=1 341 | RESULT algo=ssssort name=many-dupes size=1024 iters=1*1000 time=0.015131 stddev=0.000471234 t_gen=0.007 t_check=0 ok=1 342 | RESULT algo=stdsort name=many-dupes size=1024 iters=1*1000 time=0.00508 stddev=0.00049985 t_gen=0.007 t_check=0 ok=1 343 | RESULT algo=ssssort name=many-dupes size=2048 iters=1*1000 time=0.024427 stddev=0.000506881 t_gen=0.015 t_check=0.001 ok=1 344 | RESULT algo=stdsort name=many-dupes size=2048 iters=1*1000 time=0.021296 stddev=0.000700625 
t_gen=0.015 t_check=0 ok=1 345 | RESULT algo=ssssort name=many-dupes size=4096 iters=1*1000 time=0.044521 stddev=0.000801375 t_gen=0.03 t_check=0.003 ok=1 346 | RESULT algo=stdsort name=many-dupes size=4096 iters=1*1000 time=0.04553 stddev=0.00103835 t_gen=0.03 t_check=0 ok=1 347 | RESULT algo=ssssort name=many-dupes size=8192 iters=1*1000 time=0.088806 stddev=0.000514421 t_gen=0.061 t_check=0.006 ok=1 348 | RESULT algo=stdsort name=many-dupes size=8192 iters=1*1000 time=0.131224 stddev=0.000835777 t_gen=0.061 t_check=0 ok=1 349 | RESULT algo=ssssort name=many-dupes size=16384 iters=1*500 time=0.198408 stddev=0.00082395 t_gen=0.133 t_check=0.013 ok=1 350 | RESULT algo=stdsort name=many-dupes size=16384 iters=1*500 time=0.28908 stddev=0.00264037 t_gen=0.133 t_check=0 ok=1 351 | RESULT algo=ssssort name=many-dupes size=32768 iters=1*500 time=0.391992 stddev=0.00431894 t_gen=0.266 t_check=0.026 ok=1 352 | RESULT algo=stdsort name=many-dupes size=32768 iters=1*500 time=0.588146 stddev=0.00651103 t_gen=0.266 t_check=0 ok=1 353 | RESULT algo=ssssort name=many-dupes size=65536 iters=1*250 time=0.777524 stddev=0.00175892 t_gen=0.533 t_check=0.053 ok=1 354 | RESULT algo=stdsort name=many-dupes size=65536 iters=1*250 time=0.483008 stddev=0.00364597 t_gen=0.533 t_check=0 ok=1 355 | RESULT algo=ssssort name=many-dupes size=131072 iters=1*250 time=1.47604 stddev=0.0350133 t_gen=1.069 t_check=0.136 ok=1 356 | RESULT algo=stdsort name=many-dupes size=131072 iters=1*250 time=2.49062 stddev=0.0449233 t_gen=1.069 t_check=0 ok=1 357 | RESULT algo=ssssort name=many-dupes size=262144 iters=1*100 time=2.96147 stddev=0.00819516 t_gen=2.138 t_check=0.209 ok=1 358 | RESULT algo=stdsort name=many-dupes size=262144 iters=1*100 time=5.10603 stddev=0.00570921 t_gen=2.138 t_check=0 ok=1 359 | RESULT algo=ssssort name=many-dupes size=524288 iters=1*100 time=6.21749 stddev=0.0182743 t_gen=4.288 t_check=0.448 ok=1 360 | RESULT algo=stdsort name=many-dupes size=524288 iters=1*100 time=10.3992 
stddev=0.126304 t_gen=4.288 t_check=0 ok=1 361 | RESULT algo=ssssort name=many-dupes size=1048576 iters=1*100 time=13.5719 stddev=0.013106 t_gen=8.81 t_check=0.963 ok=1 362 | RESULT algo=stdsort name=many-dupes size=1048576 iters=1*100 time=19.1379 stddev=0.0792962 t_gen=8.81 t_check=0 ok=1 363 | RESULT algo=ssssort name=many-dupes size=2097152 iters=1*100 time=27.1729 stddev=0.0579009 t_gen=17.523 t_check=2.086 ok=1 364 | RESULT algo=stdsort name=many-dupes size=2097152 iters=1*100 time=44.1818 stddev=0.0746734 t_gen=17.523 t_check=0 ok=1 365 | RESULT algo=ssssort name=many-dupes size=4194304 iters=1*100 time=55.974 stddev=0.0649723 t_gen=35.27 t_check=4.074 ok=1 366 | RESULT algo=stdsort name=many-dupes size=4194304 iters=1*100 time=92.0805 stddev=0.273529 t_gen=35.27 t_check=0 ok=1 367 | RESULT algo=ssssort name=many-dupes size=8388608 iters=1*100 time=123.428 stddev=0.144907 t_gen=70.406 t_check=7.966 ok=1 368 | RESULT algo=stdsort name=many-dupes size=8388608 iters=1*100 time=192.119 stddev=14.4423 t_gen=70.406 t_check=0 ok=1 369 | RESULT algo=ssssort name=many-dupes size=16777216 iters=1*100 time=267.5 stddev=0.203405 t_gen=140.805 t_check=14.983 ok=1 370 | RESULT algo=stdsort name=many-dupes size=16777216 iters=1*100 time=313.302 stddev=0.249292 t_gen=140.805 t_check=0 ok=1 371 | RESULT algo=ssssort name=many-dupes size=33554432 iters=1*100 time=521.486 stddev=0.256542 t_gen=302.626 t_check=33.082 ok=1 372 | RESULT algo=stdsort name=many-dupes size=33554432 iters=1*100 time=791.061 stddev=28.864 t_gen=302.626 t_check=0 ok=1 373 | RESULT algo=ssssort name=many-dupes size=67108864 iters=1*100 time=1069.12 stddev=0.517917 t_gen=604.792 t_check=65.895 ok=1 374 | RESULT algo=stdsort name=many-dupes size=67108864 iters=1*100 time=1628.71 stddev=0.507514 t_gen=604.792 t_check=0 ok=1 375 | RESULT algo=ssssort name=few-spikes-with-noise size=1024 iters=1*1000 time=0.015264 stddev=0.00051824 t_gen=0.017 t_check=0 ok=1 376 | RESULT algo=stdsort 
name=few-spikes-with-noise size=1024 iters=1*1000 time=0.006683 stddev=0.000681891 t_gen=0.017 t_check=0 ok=1 377 | RESULT algo=ssssort name=few-spikes-with-noise size=2048 iters=1*1000 time=0.025007 stddev=0.000762583 t_gen=0.035 t_check=0.001 ok=1 378 | RESULT algo=stdsort name=few-spikes-with-noise size=2048 iters=1*1000 time=0.024281 stddev=0.000852506 t_gen=0.035 t_check=0 ok=1 379 | RESULT algo=ssssort name=few-spikes-with-noise size=4096 iters=1*1000 time=0.056276 stddev=0.00760511 t_gen=0.07 t_check=0.003 ok=1 380 | RESULT algo=stdsort name=few-spikes-with-noise size=4096 iters=1*1000 time=0.069172 stddev=0.000742271 t_gen=0.07 t_check=0 ok=1 381 | RESULT algo=ssssort name=few-spikes-with-noise size=8192 iters=1*1000 time=0.128476 stddev=0.010204 t_gen=0.142 t_check=0.006 ok=1 382 | RESULT algo=stdsort name=few-spikes-with-noise size=8192 iters=1*1000 time=0.167807 stddev=0.000861681 t_gen=0.142 t_check=0 ok=1 383 | RESULT algo=ssssort name=few-spikes-with-noise size=16384 iters=1*500 time=0.282728 stddev=0.0155434 t_gen=0.286 t_check=0.013 ok=1 384 | RESULT algo=stdsort name=few-spikes-with-noise size=16384 iters=1*500 time=0.39304 stddev=0.00229367 t_gen=0.286 t_check=0 ok=1 385 | RESULT algo=ssssort name=few-spikes-with-noise size=32768 iters=1*500 time=0.618346 stddev=0.0259267 t_gen=0.592 t_check=0.026 ok=1 386 | RESULT algo=stdsort name=few-spikes-with-noise size=32768 iters=1*500 time=0.84987 stddev=0.00238337 t_gen=0.592 t_check=0 ok=1 387 | RESULT algo=ssssort name=few-spikes-with-noise size=65536 iters=1*250 time=1.33096 stddev=0.0449349 t_gen=1.186 t_check=0.053 ok=1 388 | RESULT algo=stdsort name=few-spikes-with-noise size=65536 iters=1*250 time=1.82596 stddev=0.00447555 t_gen=1.186 t_check=0 ok=1 389 | RESULT algo=ssssort name=few-spikes-with-noise size=131072 iters=1*250 time=2.73471 stddev=0.0419756 t_gen=2.374 t_check=0.104 ok=1 390 | RESULT algo=stdsort name=few-spikes-with-noise size=131072 iters=1*250 time=4.02059 stddev=0.00702393 
t_gen=2.374 t_check=0 ok=1 391 | RESULT algo=ssssort name=few-spikes-with-noise size=262144 iters=1*100 time=5.43851 stddev=0.0701729 t_gen=4.754 t_check=0.243 ok=1 392 | RESULT algo=stdsort name=few-spikes-with-noise size=262144 iters=1*100 time=8.86324 stddev=0.0830925 t_gen=4.754 t_check=0 ok=1 393 | RESULT algo=ssssort name=few-spikes-with-noise size=524288 iters=1*100 time=10.1746 stddev=0.120545 t_gen=9.607 t_check=0.452 ok=1 394 | RESULT algo=stdsort name=few-spikes-with-noise size=524288 iters=1*100 time=18.8374 stddev=0.0298549 t_gen=9.607 t_check=0 ok=1 395 | RESULT algo=ssssort name=few-spikes-with-noise size=1048576 iters=1*100 time=21.5436 stddev=0.332238 t_gen=19.125 t_check=0.968 ok=1 396 | RESULT algo=stdsort name=few-spikes-with-noise size=1048576 iters=1*100 time=39.7012 stddev=0.140904 t_gen=19.125 t_check=0 ok=1 397 | RESULT algo=ssssort name=few-spikes-with-noise size=2097152 iters=1*100 time=46.2256 stddev=0.325871 t_gen=38.295 t_check=2.077 ok=1 398 | RESULT algo=stdsort name=few-spikes-with-noise size=2097152 iters=1*100 time=84.3939 stddev=0.0841123 t_gen=38.295 t_check=0 ok=1 399 | RESULT algo=ssssort name=few-spikes-with-noise size=4194304 iters=1*100 time=98.4335 stddev=0.652438 t_gen=79.42 t_check=4.082 ok=1 400 | RESULT algo=stdsort name=few-spikes-with-noise size=4194304 iters=1*100 time=178.753 stddev=0.457126 t_gen=79.42 t_check=0 ok=1 401 | RESULT algo=ssssort name=few-spikes-with-noise size=8388608 iters=1*100 time=214.503 stddev=0.976376 t_gen=158.955 t_check=8.307 ok=1 402 | RESULT algo=stdsort name=few-spikes-with-noise size=8388608 iters=1*100 time=379.623 stddev=0.237589 t_gen=158.955 t_check=0 ok=1 403 | RESULT algo=ssssort name=few-spikes-with-noise size=16777216 iters=1*100 time=454.432 stddev=0.47699 t_gen=316.069 t_check=15.76 ok=1 404 | RESULT algo=stdsort name=few-spikes-with-noise size=16777216 iters=1*100 time=805.753 stddev=0.207873 t_gen=316.069 t_check=0 ok=1 405 | RESULT algo=ssssort name=few-spikes-with-noise 
size=33554432 iters=1*100 time=925.575 stddev=23.4444 t_gen=631.216 t_check=33.096 ok=1 406 | RESULT algo=stdsort name=few-spikes-with-noise size=33554432 iters=1*100 time=1712.68 stddev=0.316384 t_gen=631.216 t_check=0 ok=1 407 | RESULT algo=ssssort name=few-spikes-with-noise size=67108864 iters=1*100 time=1812.62 stddev=73.894 t_gen=1259.82 t_check=66.185 ok=1 408 | RESULT algo=stdsort name=few-spikes-with-noise size=67108864 iters=1*100 time=3566.16 stddev=53.4528 t_gen=1259.82 t_check=0 ok=1 409 | RESULT algo=ssssort name=ones size=1024 iters=1*1000 time=0.007411 stddev=0.000494291 t_gen=0 t_check=0 ok=1 410 | RESULT algo=stdsort name=ones size=1024 iters=1*1000 time=0.004 stddev=0 t_gen=0 t_check=0 ok=1 411 | RESULT algo=ssssort name=ones size=2048 iters=1*1000 time=0.012417 stddev=0.00049936 t_gen=0 t_check=0.001 ok=1 412 | RESULT algo=stdsort name=ones size=2048 iters=1*1000 time=0.009002 stddev=4.4699e-05 t_gen=0 t_check=0 ok=1 413 | RESULT algo=ssssort name=ones size=4096 iters=1*1000 time=0.023707 stddev=0.000520982 t_gen=0.001 t_check=0.003 ok=1 414 | RESULT algo=stdsort name=ones size=4096 iters=1*1000 time=0.019582 stddev=0.000552794 t_gen=0.001 t_check=0 ok=1 415 | RESULT algo=ssssort name=ones size=8192 iters=1*1000 time=0.049462 stddev=0.00103229 t_gen=0.003 t_check=0.006 ok=1 416 | RESULT algo=stdsort name=ones size=8192 iters=1*1000 time=0.043145 stddev=0.00167295 t_gen=0.003 t_check=0 ok=1 417 | RESULT algo=ssssort name=ones size=16384 iters=1*500 time=0.102826 stddev=0.00063287 t_gen=0.017 t_check=0.013 ok=1 418 | RESULT algo=stdsort name=ones size=16384 iters=1*500 time=0.094058 stddev=0.000287753 t_gen=0.017 t_check=0 ok=1 419 | RESULT algo=ssssort name=ones size=32768 iters=1*500 time=0.221246 stddev=0.000725629 t_gen=0.033 t_check=0.026 ok=1 420 | RESULT algo=stdsort name=ones size=32768 iters=1*500 time=0.203504 stddev=0.00121364 t_gen=0.033 t_check=0 ok=1 421 | RESULT algo=ssssort name=ones size=65536 iters=1*250 time=0.470732 
stddev=0.00147134 t_gen=0.069 t_check=0.053 ok=1 422 | RESULT algo=stdsort name=ones size=65536 iters=1*250 time=0.43718 stddev=0.00115973 t_gen=0.069 t_check=0 ok=1 423 | RESULT algo=ssssort name=ones size=131072 iters=1*250 time=0.998256 stddev=0.0047986 t_gen=0.143 t_check=0.105 ok=1 424 | RESULT algo=stdsort name=ones size=131072 iters=1*250 time=0.934384 stddev=0.00214264 t_gen=0.143 t_check=0 ok=1 425 | RESULT algo=ssssort name=ones size=262144 iters=1*100 time=2.10628 stddev=0.00648398 t_gen=0.286 t_check=0.243 ok=1 426 | RESULT algo=stdsort name=ones size=262144 iters=1*100 time=1.98862 stddev=0.0041409 t_gen=0.286 t_check=0 ok=1 427 | RESULT algo=ssssort name=ones size=524288 iters=1*100 time=4.44063 stddev=0.0120375 t_gen=0.58 t_check=0.429 ok=1 428 | RESULT algo=stdsort name=ones size=524288 iters=1*100 time=4.21675 stddev=0.00844516 t_gen=0.58 t_check=0 ok=1 429 | RESULT algo=ssssort name=ones size=1048576 iters=1*100 time=9.40002 stddev=0.0164556 t_gen=1.312 t_check=0.962 ok=1 430 | RESULT algo=stdsort name=ones size=1048576 iters=1*100 time=8.92384 stddev=0.0179982 t_gen=1.312 t_check=0 ok=1 431 | RESULT algo=ssssort name=ones size=2097152 iters=1*100 time=19.8333 stddev=0.0377351 t_gen=2.843 t_check=2.077 ok=1 432 | RESULT algo=stdsort name=ones size=2097152 iters=1*100 time=18.8477 stddev=0.0248416 t_gen=2.843 t_check=0 ok=1 433 | RESULT algo=ssssort name=ones size=4194304 iters=1*100 time=42.2056 stddev=0.0647792 t_gen=8.161 t_check=4.076 ok=1 434 | RESULT algo=stdsort name=ones size=4194304 iters=1*100 time=39.7051 stddev=0.0433151 t_gen=8.161 t_check=0 ok=1 435 | RESULT algo=ssssort name=ones size=8388608 iters=1*100 time=88.0409 stddev=0.0800986 t_gen=16.076 t_check=8.197 ok=1 436 | RESULT algo=stdsort name=ones size=8388608 iters=1*100 time=83.3976 stddev=0.0487809 t_gen=16.076 t_check=0 ok=1 437 | RESULT algo=ssssort name=ones size=16777216 iters=1*100 time=183.321 stddev=0.114618 t_gen=32.354 t_check=16.272 ok=1 438 | RESULT algo=stdsort 
name=ones size=16777216 iters=1*100 time=174.794 stddev=0.0933617 t_gen=32.354 t_check=0 ok=1 439 | RESULT algo=ssssort name=ones size=33554432 iters=1*100 time=381.775 stddev=0.140153 t_gen=65.199 t_check=31.204 ok=1 440 | RESULT algo=stdsort name=ones size=33554432 iters=1*100 time=365.757 stddev=0.166132 t_gen=65.199 t_check=0 ok=1 441 | RESULT algo=ssssort name=ones size=67108864 iters=1*100 time=794.791 stddev=1.59226 t_gen=128.803 t_check=62.082 ok=1 442 | RESULT algo=stdsort name=ones size=67108864 iters=1*100 time=763.072 stddev=0.184165 t_gen=128.803 t_check=0 ok=1 443 | --------------------------------------------------------------------------------