├── .gitignore ├── CHANGELOG.md ├── README.md ├── diagram.svg ├── project.clj ├── src └── merge_insertion_sort │ └── core.cljc └── test └── merge_insertion_sort └── test └── core.clj /.gitignore: -------------------------------------------------------------------------------- 1 | target/ 2 | .lein-repl-history 3 | .nrepl-port 4 | -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | ## 1.0.2 2 | 3 | Fix "already refers to" warning for fn->comparator when using ClojureScript. 4 | 5 | ## 1.0.1 6 | 7 | Add support for ClojureScript 8 | 9 | ## 1.0.0 10 | 11 | Initial release 12 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Merge Insertion Sort / Ford-Johnson Algorithm 2 | 3 | [![Clojars Project](https://img.shields.io/clojars/v/decidedlyso/merge-insertion-sort.svg)](https://clojars.org/decidedlyso/merge-insertion-sort) 4 | 5 | This Clojure / ClojureScript library implements the [Merge Insertion sorting algorithm](https://en.wikipedia.org/wiki/Merge-insertion_sort) (also known as the Ford-Johnson Algorithm). 6 | 7 | Merge Insertion Sort is a [comparison sort](https://en.wikipedia.org/wiki/Comparison_sort) that minimizes the worst-case number of comparisons for small N (it has been proven optimal for N ≤ 18 and N = 20 to 22, and is likely optimal for N < 47). 8 | 9 | This algorithm will not make your programs faster. Even though it has a better worst-case for small N than most other algorithms, its implementation results in worse performance on actual computers than simpler algorithms (like quicksort, timsort and merge sort). 10 | 11 | However, in situations where the cost-of-comparison greatly outweighs the cost-of-overhead of the algorithm, Merge Insertion Sort can be useful. 
For example, this implementation was motivated by wanting to minimize the number of pairwise comparisons for an (~infinitely-slow) human to sort a list. 12 | 13 | It is also worth noting that just because Merge Insertion Sort has the best worst case (for small N), it doesn't mean it has the best average case (which, as far as I am aware, is an open problem). 14 | 15 | ## Installation 16 | 17 | Add the following dependency to your `project.clj`: 18 | 19 | ```clojure 20 | [decidedlyso/merge-insertion-sort "1.0.2"] 21 | ``` 22 | 23 | ## Usage 24 | 25 | As with Clojure's built-in `sort`, you can optionally provide a comparator function that returns `-1/0/+1` or `true/false`: 26 | 27 | ```clojure 28 | (ns example 29 | (:require 30 | [merge-insertion-sort.core :as mi])) 31 | 32 | => (mi/sort [1 5 3 4 2 7 6 10 9 8]) 33 | (1 2 3 4 5 6 7 8 9 10) 34 | 35 | => (mi/sort > [1 5 3 4 2 7 6 10 9 8]) 36 | (10 9 8 7 6 5 4 3 2 1) 37 | 38 | => (let [comparisons (atom 0)] 39 | (mi/sort (fn [a b] 40 | (swap! comparisons inc) 41 | (compare a b)) 42 | (shuffle [1 2 3 4 5 6 7 8 9 10])) 43 | @comparisons) 44 | 22 45 | ``` 46 | 47 | 48 | ## Background 49 | 50 | Merge Insertion Sort was initially described by Ford and Johnson in 1959 (Ford 1959), but only took on its name when it was featured in the Art of Computer Programming (Knuth 1968). It is often referred to as the Ford-Johnson Algorithm. 51 | 52 | The algorithm combines merging (like in merge-sort) and binary-search-insertion (like in insertion-sort), but it is able to achieve better worst-case performance by better selecting which elements to compare, so as to maximize the efficiency of performing binary-search-insertion. 53 | 54 | The key insight that underlies Merge Insertion Sort is that it costs the same to perform binary-search-insertion on a list of `N = 2^K` as on a list of `N = 2^(K+1)-1`. For example, the worst-case cost of binary-search-insertion into a list of `N = 8` is `floor(log2(N)) + 1 = 4` comparisons, and it is the same for `N = 9 to 15`. 
When given a choice between elements to compare, the algorithm prefers pairs that would require inserting an element into a list of `N = 2^K-1` (i.e. a list with length one less than a power of 2), because it can insert that element for one less comparison than if that insertion were to be made one insertion later. 55 | 56 | It turns out that the order of such comparisons can be determined by an integer progression called the [Jacobsthal numbers](https://en.wikipedia.org/wiki/Jacobsthal_number), when optimizing for the worst-case. 57 | 58 | 59 | ## The Algorithm 60 | 61 | The Merge Insertion Sort algorithm is as follows: 62 | 63 | 1. Given an unsorted list, group the list into pairs. If the list has an odd number of elements, the last element is unpaired. 64 | 2. Each pair is sorted (using a single comparison each) into what we will call [a b] pairs, with `b` "<" `a`. 65 | 3. The pairs are sorted recursively based on the `a` of each, and we call the pairs [a1 b1], [a2 b2] etc. If the list had an odd number of elements, the unpaired element is considered the last `b`. 66 | 4. We call the chain of `a`s the "main-chain". 67 | 68 | At this point, we could take any of the `b`s and use binary-search-insertion to insert that `b` into the main-chain (which starts off as just the `a`s). When inserting, we only need to consider the values "left" of the `b` in question (for example, when inserting `b4` we only need to consider the chain up to and including `a3`). 69 | 70 | We could insert the `b`s in order (`b1`, `b2` ...), but the "key insight" from above suggests otherwise. Different `b`s have different worst-case costs to insert into the main-chain (the worst-case cost for binary-search-insertion is `floor(log2(N)) + 1`, where N is the length of the relevant part of the main-chain). We can minimize the cost by following an order based on the Jacobsthal Numbers: *1* *3* 2 *5* 4 *11* 10 9 8 7 6 *21* 20 19 18... (ignoring values which are greater than the number of `b`s we have). 
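
The insertion order described above can be generated directly from the Jacobsthal recurrence `J(n) = J(n-1) + 2*J(n-2)`. The following is a minimal sketch, not the library's own code: the names `jacobsthal-seq` and `insertion-order` are illustrative, and the result uses the 1-indexed `b` labels from the text.

```clojure
(def jacobsthal-seq
  "Lazy sequence of Jacobsthal numbers: 0 1 1 3 5 11 21 ..."
  (map first (iterate (fn [[a b]] [b (+ b (* 2 a))]) [0 1])))

(defn insertion-order
  "1-indexed labels of the `b`s, in the order they would be inserted,
   given `n` pending elements (illustrative helper)."
  [n]
  (let [;; Jacobsthal numbers below n, capped with n itself
        bounds (concat (take-while #(< % n) jacobsthal-seq) [n])]
    ;; between consecutive bounds, count down from the upper bound
    (mapcat (fn [[lo hi]] (range hi lo -1))
            (partition 2 1 bounds))))

(insertion-order 11)
;; => (1 3 2 5 4 11 10 9 8 7 6)
```

Each descending run ends just above the previous bound, so every `b` in a run is inserted into a searched range of the same worst-case cost.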
71 | 72 | And so, we insert the `b`s, one at a time, into the main-chain following the above progression, eventually resulting in a sorted list. 73 | 74 | Below is an example that walks through the above steps: 75 | 76 | ![example diagram](./diagram.svg) 77 | 78 | If the above explanation was confusing, peruse the code, or try: (Ford 1959), (Knuth 1968), (Ayala-Rincon 2007), or (JaakkoK 2009). A C++ implementation also exists (Morwenn Github 2016) with an explanation by the author (Morwenn StackExchange 2016). 79 | 80 | 81 | ## Performance 82 | 83 | From information theory, the lower bound for the minimum comparisons needed to sort a list is: `⌈log2(N!)⌉` 84 | 85 | As derived in (Knuth 1968), the worst-case number of comparisons for Merge Insertion Sort is: `sum(k=1..N) of ⌈log2(3k/4)⌉`. 86 | 87 | For comparison, the worst case for binary-search-insertion is `sum(k=1..N) of ⌈log2(k)⌉`. 88 | 89 | In a table, this gives: 90 | 91 | | n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 92 | | --------------------------------------------- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | 93 | | worst case lower bound | 0 | 1 | 3 | 5 | 7 | 10 | 13 | 16 | 19 | 22 | 26 | 29 | 33 | 37 | 41 | 45 | 49 | 53 | 57 | 62 | 66 | 70 | 94 | | merge-insertion worst case | 0 | 1 | 3 | 5 | 7 | 10 | 13 | 16 | 19 | 22 | 26 | 30 | 34 | 38 | 42 | 46 | 50 | 54 | 58 | 62 | 66 | 71 | 95 | | merge-insertion optimal by information theory | * | * | * | * | * | * | * | * | * | * | * | | | | | | | | | * | * | | 96 | | merge-insertion optimal by exhaustive search | | | | | | | | | | | | * | * | * | * | * | * | * | | | | * | 97 | | binary-search-insertion worst case | 0 | 1 | 3 | 5 | 8 | 11 | 14 | 17 | 21 | 25 | 29 | 33 | 37 | 41 | 45 | 49 | 54 | 59 | 64 | 69 | 74 | 79 | 98 | | `⌈N*log2(N)⌉` | 0 | 2 | 5 | 8 | 12 | 16 | 20 | 24 | 29 | 34 | 39 | 44 | 49 | 54 | 59 | 64 | 70 | 76 | 81 | 87 | 93 | 99 | 99 | 
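
The formulas behind the first, second and fifth data rows of the table can be evaluated with exact integer arithmetic, which sidesteps floating-point rounding in `log2`. This is a minimal sketch for JVM Clojure; the helper names `ceil-log2`, `lower-bound`, `merge-insertion-worst-case` and `binary-insertion-worst-case` are illustrative and not part of this library.

```clojure
(defn ceil-log2
  "Smallest integer c >= 0 with 2^c >= x, i.e. ⌈log2(x)⌉ for x > 1/2.
   Counts the powers of two below x; `*'` promotes to BigInt on overflow."
  [x]
  (count (take-while #(< % x) (iterate #(*' 2 %) 1))))

(defn lower-bound
  "Information-theoretic lower bound: ⌈log2(n!)⌉."
  [n]
  (ceil-log2 (reduce *' (range 1 (inc n)))))

(defn merge-insertion-worst-case
  "Sum over k = 1..n of ⌈log2(3k/4)⌉ (3k/4 is kept as an exact ratio)."
  [n]
  (reduce + (map #(ceil-log2 (/ (* 3 %) 4)) (range 1 (inc n)))))

(defn binary-insertion-worst-case
  "Sum over k = 1..n of ⌈log2(k)⌉."
  [n]
  (reduce + (map ceil-log2 (range 1 (inc n)))))

(map lower-bound (range 1 13))
;; => (0 1 3 5 7 10 13 16 19 22 26 29)
(map merge-insertion-worst-case (range 1 13))
;; => (0 1 3 5 7 10 13 16 19 22 26 30)
(map binary-insertion-worst-case (range 1 13))
;; => (0 1 3 5 8 11 14 17 21 25 29 33)
```

The first two sequences agree up to `n = 11` and first differ at `n = 12`, which is where optimality can no longer be read off the information-theoretic bound alone.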
100 | Merge Insertion Sort's worst-case is equal to the information theoretic minimum, and thus optimal, for N = 1 to 11, 20 and 21. 101 | 102 | Via exhaustive search on computers, Merge Insertion Sort was proven optimal for N = 12 (Wells 1965); 13 (Kasai 1994); 14, 15, 22 (Peczarski 2004); 16, 17, 18 (Stober & Weiss 2023). 103 | 104 | Whether it is optimal for N = 19 is still an open question, but no better algorithm has been found for N < 47 and it has been proven that it would need to rely on a completely different strategy (Peczarski 2007). There are modified algorithms that perform similarly to Merge Insertion Sort (Ayala-Rincon 2007), and ones that are better at N > 47 (Manacher 1979, Bui 1985, Manacher 1989). 105 | 106 | The minimum number of comparisons is also tracked in a list in the Online Encyclopedia of Integer Sequences (Sloane 2017). 107 | 108 | 109 | ## References 110 | 111 | - __(Ayala-Rincon 2007)__ 112 | M. Ayala-Rincon, B. T. de Abreu, J. de Sequira, A variant of the Ford-Johnson algorithm that is more space efficient. Inf. Proc. Letters, 102, 5 (2007) 201–207. 113 | https://www.researchgate.net/publication/222571621_A_variant_of_the_Ford-Johnson_algorithm_that_is_more_space_efficient 114 | 115 | - __(Bui 1985)__ 116 | T. D. Bui, M. Thanh, Significant improvements to the Ford-Johnson algorithm for sorting, BIT, 25 (1985) 70–75. 117 | https://link.springer.com/article/10.1007/BF01934989 118 | 119 | - __(Ford 1959)__ 120 | L. Ford, S. Johnson, A tournament problem, American Mathematical Monthly 66 (1959) 387–389. 121 | https://www.jstor.org/stable/2308750 122 | 123 | - __(JaakkoK 2009)__ 124 | https://stackoverflow.com/questions/1935194/sorting-an-array-with-minimal-number-of-comparisons/1935353#1935353 125 | 126 | - __(Kasai 1994)__ 127 | T. Kasai, S. Sawato, S. Iwata, Thirty Four Comparisons Are Required to Sort 13 Items, LNCS, 792, (1994) 260–269. 128 | 129 | - __(Knuth 1968)__ 130 | D. 
Knuth, The Art of Computer Programming, Volume 3, Section 5.3.1 (1968). 131 | Relevant extract: https://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/cs341/FJ.pdf 132 | 133 | - __(Manacher 1979)__ 134 | G. K. Manacher, The Ford-Johnson sorting algorithm is not optimal. J. ACM, 26, 3 (1979) 441–456. 135 | 136 | - __(Manacher 1989)__ 137 | G. K. Manacher, T. D. Bui, T. Mai, Optimal combinations of sorting and merging, J. ACM 36 (1989) 290–334. 138 | 139 | - __(Morwenn Github 2016)__ 140 | https://github.com/Morwenn/cpp-sort/blob/master/include/cpp-sort/detail/merge_insertion_sort.h 141 | 142 | - __(Morwenn StackExchange 2016)__ 143 | https://codereview.stackexchange.com/questions/116367/ford-johnson-merge-insertion-sort 144 | 145 | - __(Peczarski 2004)__ 146 | M. Peczarski, New Results in Minimum-Comparison Sorting, Algorithmica, 40, 2 (2004) 133-145. 147 | 148 | - __(Peczarski 2007)__ 149 | M. Peczarski, The Ford-Johnson algorithm still unbeaten for less than 47 elements, Inf. Proc. Letters, 101, 3 (2007) 126–128. 150 | 151 | - __(Peczarski 2012)__ 152 | M. Peczarski, Towards optimal sorting of 16 elements. Acta Univ. Sapientiae, Inform. 4, 2 (2012) 215–224. 153 | https://arxiv.org/pdf/1108.0866.pdf 154 | 155 | - __(Sloane 2017)__ 156 | N. J. A. Sloane, Sorting numbers: minimal number of comparisons needed to sort n elements, The Online Encyclopedia of Integer Sequences. 157 | https://oeis.org/A036604 158 | 159 | - __(Stober & Weiss 2023)__ 160 | F. Stober, A. Weiss, Lower bounds for sorting 16, 17, and 18 elements, 2023 Proc. Symp. Algorithm Engineering and Experiments, (2023) 201-213. 161 | https://epubs.siam.org/doi/10.1137/1.9781611977561.ch17 162 | 163 | - __(Wells 1965)__ 164 | M. Wells, Applications of a Language for Computing in Combinatorics, Information Processing, 65, (1965) 497–498. 
165 | -------------------------------------------------------------------------------- /diagram.svg: -------------------------------------------------------------------------------- [SVG markup not preserved. Text recoverable from the figure: a worked example sorting the input 5 2 3 1 4 7 6; stage labels SPLIT INTO PAIRS, SORT and INSERT; element labels A1 to A3 and B1 to B4; per-insertion annotations "target chain length" and "insertion cost"; and the caption "total worst-case # of comparisons: 13".] -------------------------------------------------------------------------------- /project.clj: -------------------------------------------------------------------------------- 1 | (defproject decidedlyso/merge-insertion-sort "1.0.2" 2 | 3 | :description "Implementation of the comparison-efficient Merge Insertion Sort / Ford-Johnson Algorithm" 4 | 5 | :url "https://github.com/decidedlyso/merge-insertion-sort" 6 | 7 | :dependencies [[org.clojure/clojure "1.8.0"]] 8 | 9 | :main merge-insertion-sort.core) 10 | -------------------------------------------------------------------------------- /src/merge_insertion_sort/core.cljc: -------------------------------------------------------------------------------- 1 | (ns merge-insertion-sort.core 2 | (:refer-clojure :exclude [sort fn->comparator])) 3 | 4 | ; clojurescript has this in core, but clojure does not 5 | ; taken from cljs/core.cljs#L2383 6 | (defn fn->comparator 7 | "Given a fn that might be boolean valued or a comparator, 8 | return a fn that is a comparator." 9 | [f] 10 | (if (= f compare) 11 | compare 12 | (fn [x y] 13 | (let [r (f x y)] 14 | (if (number? 
r) 15 | r 16 | (if r 17 | -1 18 | (if (f y x) 1 0))))))) 19 | 20 | (defn jacobsthal 21 | "Return the nth Jacobsthal number (0-indexed)" 22 | [n] 23 | (Math/round (/ (+ (Math/pow 2 n) 24 | (Math/pow -1 (dec n))) 25 | 3))) 26 | 27 | (defn pending-element-order 28 | "Indexes to insert `b`s at, given `n` to insert (0-indexed)" 29 | [n] 30 | (-> (->> (range) 31 | (map jacobsthal) 32 | (take-while (partial > n))) 33 | (concat [n]) 34 | (->> 35 | (partition 2 1) 36 | (mapcat (fn [[a b]] 37 | (range b a -1))) 38 | (map dec)))) 39 | 40 | (defn binary-search-insertion-point 41 | "Return the index at which to insert `n` into sorted `coll`, 42 | using binary search. If `comp` not provided, uses compare." 43 | ([n coll] 44 | (binary-search-insertion-point compare n coll)) 45 | 46 | ([comp n coll] 47 | (binary-search-insertion-point comp n 0 (dec (count coll)) coll)) 48 | 49 | ([comp n lower-bound upper-bound coll] 50 | (if (not (<= lower-bound upper-bound)) 51 | lower-bound 52 | (let [comp (fn->comparator comp) 53 | mid-index (quot (+ lower-bound upper-bound) 2)] 54 | (let [c (comp n (nth coll mid-index))] ; test only the sign: comparators may return any integer, not just -1/0/1 55 | (cond (pos? c) (binary-search-insertion-point comp n (inc mid-index) upper-bound coll) 56 | (zero? c) mid-index 57 | :else (binary-search-insertion-point comp n lower-bound (dec mid-index) coll))))))) 58 | 59 | (defn insert 60 | "Returns a new collection with `n` inserted at index `i` into `coll`." 61 | [coll n i] 62 | (concat (take i coll) [n] (drop i coll))) 63 | 64 | (defn binary-insert 65 | "Returns a new collection with `n` inserted into sorted `coll` 66 | so as to keep it sorted. If `comp` not provided, uses compare." 67 | ([n coll] 68 | (binary-insert compare n coll)) 69 | 70 | ([comp n coll] 71 | (let [comp (fn->comparator comp)] 72 | (insert coll n (binary-search-insertion-point comp n coll))))) 73 | 74 | (defn sort 75 | "Returns a sorted sequence of the items in coll, using the merge-insertion sorting algorithm. 76 | If `comp` not provided, uses compare." 
77 | ([coll] 78 | (sort compare coll)) 79 | 80 | ([comp coll] 81 | (if (< (count coll) 2) 82 | coll 83 | (let [comp (fn->comparator comp) 84 | sorted-pairs (->> coll 85 | ; split into pairs [[a b] [a b] ...] 86 | ; ignore last element if coll length is odd (for now) 87 | (partition 2 2) 88 | ; sort each pair so that b "<" a 89 | (map (fn [pair] 90 | (if (< 0 (apply comp pair)) 91 | pair 92 | (reverse pair)))) 93 | ; recursively sort pairs by the `a`s 94 | ; so that [ [a1 b1] [a2 b2] ...] a1 "<" a2 95 | (sort (fn [x y] 96 | (comp (first x) (first y))))) 97 | 98 | ; initialize the `main-chain` with the `a`s from above 99 | ; and `pending-elements` with the `b`s from above 100 | ; if coll length is odd, add the (previously ignored) last element to pending-elements 101 | ; main chain: [a1 a2 a3 a4] 102 | ; pending elements: [b1 b2 b3 b4 bX] 103 | main-chain (atom (vec (map first sorted-pairs))) 104 | pending-elements (vec (map last sorted-pairs)) 105 | pending-elements (if (odd? (count coll)) 106 | (conj pending-elements (last coll)) 107 | pending-elements) 108 | 109 | ; initialize helper atom + fns to keep track of the positions of `a`s in main-chain 110 | ; so that we can always get the current main-chain up to a requested `a` 111 | ; (the positions of `a`s will change as pending-elements get inserted in the main-chain) 112 | a-positions (atom (vec (range (count @main-chain)))) 113 | update-a-positions (fn [a-positions i] 114 | (vec (map (fn [pos] 115 | (if (>= pos i) 116 | (inc pos) 117 | pos)) 118 | a-positions))) 119 | main-chain-until (fn [a-index] 120 | (take (get @a-positions a-index (count @main-chain)) @main-chain)) 121 | 122 | ; fn to insert the pending-element corresponding to `b-index` 123 | ; into the relevant part of the main chain (using binary search insertion) 124 | binary-insert! (fn [b-index] 125 | (let [n (pending-elements b-index) 126 | insert-index (binary-search-insertion-point comp n (main-chain-until b-index))] 127 | (swap! 
a-positions update-a-positions insert-index) 128 | (swap! main-chain insert n insert-index)))] 129 | 130 | ; binary-insert each pending-element into the main-chain 131 | ; in an order that maximizes binary search efficiency 132 | (doseq [i (pending-element-order (count pending-elements))] 133 | (binary-insert! i)) 134 | 135 | @main-chain)))) 136 | -------------------------------------------------------------------------------- /test/merge_insertion_sort/test/core.clj: -------------------------------------------------------------------------------- 1 | (ns merge-insertion-sort.test.core 2 | (:require 3 | [clojure.test :refer :all] 4 | [merge-insertion-sort.core :as lib])) 5 | 6 | (deftest binary-search-insertion-point 7 | (testing "empty" 8 | (is (= 0 (lib/binary-search-insertion-point 1 [])))) 9 | 10 | (testing "start" 11 | (is (= 0 (lib/binary-search-insertion-point 0 [1])))) 12 | 13 | (testing "end" 14 | (is (= 1 (lib/binary-search-insertion-point 2 [1]))) 15 | (is (= 2 (lib/binary-search-insertion-point 2 [0 1])))) 16 | 17 | (testing "middle odd" 18 | (is (= 1 (lib/binary-search-insertion-point 1 [0 2]))) 19 | (is (= 1 (lib/binary-search-insertion-point 1 [0 2 3 4])))) 20 | 21 | (testing "middle even" 22 | (is (= 1 (lib/binary-search-insertion-point 1 [0 2 3]))) 23 | (is (= 1 (lib/binary-search-insertion-point 1 [0 2 3 4 5]))))) 24 | 25 | (deftest insert 26 | (is (= [1] (lib/insert [] 1 0))) 27 | (is (= [1 2 3] (lib/insert [1 3] 2 1)))) 28 | 29 | (deftest binary-insert 30 | (is (= [0 1 2 3 4 5] (lib/binary-insert compare 4 [0 1 2 3 5]))) 31 | (is (= [4] (lib/binary-insert compare 4 []))) 32 | (is (= [1 2 3 4] (lib/binary-insert compare 4 [1 2 3])))) 33 | 34 | (deftest sort 35 | (let [target-max-comparisons [0 0 1 3 5 7 10 13 16 19 36 | 22 26 30 34 38 42 46 50 54 58 62 66 71] 37 | benchmark (fn [n] 38 | (let [sample-count 100 39 | cs (atom []) 40 | c (atom 0) 41 | errors (atom [])] 42 | (dotimes [_ sample-count] 43 | (let [c (atom 0) 44 | coll (range n) 45 | 
shuffled-coll (shuffle coll) 46 | result (lib/sort 47 | (fn [a b] 48 | (swap! c inc) 49 | (compare a b)) 50 | shuffled-coll)] 51 | (when (not= result coll) 52 | (swap! errors conj shuffled-coll)) 53 | (swap! cs conj @c))) 54 | {:errors @errors 55 | :max-comparisons (apply max @cs)}))] 56 | 57 | (doseq [i (range (count target-max-comparisons))] 58 | (let [{:keys [errors max-comparisons]} (benchmark i)] 59 | 60 | (testing (str "sorts properly for n = " i) 61 | (is (= [] errors))) 62 | 63 | (testing (str "counts optimally for n = " i) 64 | (is (= (target-max-comparisons i) max-comparisons)))))) 65 | 66 | (testing "sorts with repeating" 67 | (is (= [1 1 2 2 3 3 4 4] (lib/sort [1 2 3 4 4 3 2 1])))) 68 | 69 | (testing "sorts non-integers" 70 | (is (= [[3 1] [4 2]] 71 | (lib/sort 72 | (fn [x y] 73 | (compare (first x) (first y))) 74 | [[4 2] [3 1]]))))) 75 | --------------------------------------------------------------------------------